Obtaining insights from data/events for business and environmental health services
Ph.D. candidate in Spatial Information Science and Engineering;
Ph.D. in Molecular Genetics & Biotechnology
Project description: This project optimizes machine learning and deep learning approaches for predicting the stock price of a selected company. My assumption is that stock price fluctuation patterns vary between companies. Many existing machine learning and deep learning approaches can be applied to a time series of one company's stock prices, most likely with different prediction accuracy and precision. Even with the same approach, the results vary when the feature selection or feature engineering methods differ. This project will apply a series of machine learning and deep learning approaches to historical datasets of selected stocks and choose the best combination of algorithm, feature selection method, and similarity/distance measure for each specific company's stock.
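The selection step described above can be sketched as a small comparison loop. This is only an illustration under stated assumptions: the candidate predictors (`naive_last`, `moving_average`), the helper `walk_forward_rmse`, and the price list are hypothetical placeholders, not the models or data used in this project. Each candidate is scored on one-step-ahead predictions that use only earlier observations, and the candidate with the lowest error is kept.

```python
import math

def naive_last(history):
    """Predict the next value as the most recent observation."""
    return history[-1]

def moving_average(window):
    """Return a predictor that averages the last `window` observations."""
    def predict(history):
        recent = history[-window:]
        return sum(recent) / len(recent)
    return predict

def walk_forward_rmse(series, predict, warmup):
    """Score one-step-ahead predictions, using only data before each step."""
    errors = [
        (predict(series[:t]) - series[t]) ** 2
        for t in range(warmup, len(series))
    ]
    return math.sqrt(sum(errors) / len(errors))

# Hypothetical candidates; a real run would list actual models and feature sets.
candidates = {
    "naive_last": naive_last,
    "moving_average_3": moving_average(3),
    "moving_average_5": moving_average(5),
}

# Made-up chronological prices for illustration only (oldest first).
prices = [100.0, 101.5, 103.0, 102.0, 104.5, 106.0, 105.0, 107.5, 109.0, 108.0]

# Score every candidate the same way, then keep the one with the lowest RMSE.
scores = {
    name: walk_forward_rmse(prices, fn, warmup=5)
    for name, fn in candidates.items()
}
best = min(scores, key=scores.get)
print(best, round(scores[best], 3))
```

Repeating the same loop per company would give one winning combination per stock, which is the comparison the project description has in mind.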
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
stock_data = pd.read_csv("amzn2014-2019.csv")
stock_data['date'] = pd.to_datetime(stock_data['date'], format = '%m/%d/%Y')
stock_data['close'] = pd.to_numeric(stock_data['close'], errors='coerce')
stock_data['volume'] = pd.to_numeric(stock_data['volume'], errors='coerce')
stock_data['open'] = pd.to_numeric(stock_data['open'], errors='coerce')
stock_data['high'] = pd.to_numeric(stock_data['high'], errors='coerce')
stock_data['low'] = pd.to_numeric(stock_data['low'], errors='coerce')
# Use the day's average of the low and high AMZN prices, together with the day's traded volume,
# as inputs to predict the stock prices.
import math
# Vectorized mean: keeps decimal precision and tolerates NaN values produced by errors='coerce'
stock_data["average"] = (stock_data["high"] + stock_data["low"]) / 2
# Select the features by name and convert to a NumPy array: column 0 = volume, column 1 = average
input_feature = stock_data[["volume", "average"]].values
input_data = input_feature
# plot the daily traded volume of the Amazon stock (AMZN)
plt.plot(input_data[:,0])
plt.title("Volume of stocks sold")
plt.xlabel("Time (most current-> oldest) (5/6/2019-5/6/2014)")
plt.ylabel("Volume of stocks traded")
plt.show()
# plot the daily average price of the Amazon stock (AMZN)
plt.plot(input_data[:,1], color='blue')
plt.title("Amazon Stock Prices")
plt.xlabel("Time (latest-> oldest) (5/6/2019-5/6/2014)")
plt.ylabel("Average Stock Price for The Day")
plt.show()
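The axis labels above note that the rows run from the latest date to the oldest, and the two features sit on very different scales. A minimal sketch of one possible preparation step follows, assuming the same `date`, `volume`, and `average` column names; the small `frame` and its values are made up for illustration and are not taken from the AMZN file.

```python
import pandas as pd

# Hypothetical stand-in for the AMZN frame, listed newest-first like the plots.
frame = pd.DataFrame({
    "date": pd.to_datetime(
        ["2019-05-06", "2019-05-03", "2019-05-02", "2019-05-01"]
    ),
    "volume": [5_000_000, 3_000_000, 4_000_000, 2_000_000],  # made-up values
    "average": [1940.0, 1950.0, 1910.0, 1920.0],             # made-up values
})

# Put the rows in chronological order (oldest first).
ordered = frame.sort_values("date").reset_index(drop=True)

# Min-max scale each feature column to [0, 1] so volume does not dominate price.
features = ordered[["volume", "average"]]
scaled = (features - features.min()) / (features.max() - features.min())

print(ordered["date"].iloc[0].date())
print(scaled.round(3))
```

For forecasting, the minimum and maximum would normally be computed on the training portion only, so that later prices do not leak into the scaling.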