SIMPLERECOMMENDER
Published on PyPI
A Python package that keeps recommendations simple: it recommends items to a specific existing user based on that user's ratings and the average rating per item. It blends popularity-based and content-based approaches.
Link to PyPI
Link to a detailed example
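The blend described above can be illustrated with a small sketch. This is not the package's actual API: the function name `recommend`, the `weight` parameter, the item categories, and the toy ratings are all assumptions made for illustration. The idea shown is to score each unseen item by mixing its average rating (popularity) with the user's own average rating for that item's category (a simple content signal).

```python
from collections import defaultdict

# Hypothetical toy data: item categories and (user, item, rating) triples.
ITEM_CATEGORY = {
    "book_a": "fantasy", "book_b": "history",
    "book_c": "fantasy", "book_d": "history",
}
RATINGS = [
    ("alice", "book_a", 5), ("alice", "book_b", 2),
    ("bob", "book_a", 4), ("bob", "book_c", 5), ("bob", "book_d", 3),
    ("carol", "book_b", 3), ("carol", "book_c", 4), ("carol", "book_d", 4),
]


def mean(values):
    return sum(values) / len(values)


def recommend(user, ratings, item_category, top_n=2, weight=0.5):
    """Rank items the user has not rated yet.

    score = weight * (average rating of the item)
          + (1 - weight) * (user's average rating in the item's category)
    """
    item_scores = defaultdict(list)      # item -> all ratings it received
    user_cat_scores = defaultdict(list)  # category -> this user's ratings
    seen = set()
    for u, item, rating in ratings:
        item_scores[item].append(rating)
        if u == user:
            seen.add(item)
            user_cat_scores[item_category[item]].append(rating)
    if not seen:
        raise ValueError("recommendations need an existing user with ratings")

    # Fallback taste for categories the user has never rated.
    user_mean = mean([r for rs in user_cat_scores.values() for r in rs])

    scored = []
    for item, item_ratings in item_scores.items():
        if item in seen:
            continue
        cat = item_category[item]
        taste = mean(user_cat_scores[cat]) if cat in user_cat_scores else user_mean
        scored.append((item, weight * mean(item_ratings) + (1 - weight) * taste))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


recs = recommend("alice", RATINGS, ITEM_CATEGORY)
print(recs)
```

Setting `weight=1.0` reduces this to a pure popularity ranking, while `weight=0.0` relies only on the user's own taste.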
DECISION MAKING WITH STATISTICS
Predictive Analytics
In this project, I used several statistical methods, such as hypothesis testing, ANOVA, and chi-square, to arrive at business conclusions by summarizing the results of the statistical tests listed below. It also covers a simulation of the CLT (Central Limit Theorem).
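A CLT simulation can be sketched in a few lines of NumPy. The exponential population, the sample size of 50, and the helper name `sample_means` are assumptions chosen for illustration, not details taken from the project.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# A deliberately skewed (non-normal) population.
population = rng.exponential(scale=2.0, size=100_000)


def sample_means(population, sample_size, n_samples, rng):
    """Draw n_samples random samples and return the mean of each one."""
    samples = rng.choice(population, size=(n_samples, sample_size), replace=True)
    return samples.mean(axis=1)


sample_size = 50
means = sample_means(population, sample_size, n_samples=5_000, rng=rng)

# The CLT predicts that the sample means cluster around the population mean
# with a spread close to sigma / sqrt(n), whatever the population's shape.
expected_se = population.std() / np.sqrt(sample_size)
print("population mean:", population.mean())
print("mean of sample means:", means.mean())
print("std of sample means:", means.std(), "expected:", expected_se)
```

Plotting a histogram of `means` for increasing `sample_size` shows the distribution becoming more bell-shaped, even though the population itself is skewed.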
Hypothesis tests used:
1. Left-tailed and right-tailed hypothesis testing
2. Two-tailed hypothesis testing
3. Post-hoc test
4. Shapiro-Wilk test to check normality
5. Levene test to check the equality of variance
6. Mann-Whitney U test as a nonparametric test
7. ANOVA
8. Chi-square
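As one worked example from the list above, the chi-square test of independence can be computed by hand. The 2x2 table below is hypothetical, and the sketch compares the statistic with the standard critical value of 3.841 (one degree of freedom, 5% significance level) instead of computing a p-value. In practice `scipy.stats.chi2_contingency` returns the p-value directly; note that it applies Yates' continuity correction to 2x2 tables by default, which this sketch does not.

```python
import numpy as np

# Hypothetical contingency table: rows = customer segment, columns = bought / did not buy.
observed = np.array([[30, 10],
                     [20, 40]], dtype=float)


def chi_square_statistic(observed):
    """Pearson chi-square statistic for a contingency table (no continuity correction)."""
    row_totals = observed.sum(axis=1, keepdims=True)
    col_totals = observed.sum(axis=0, keepdims=True)
    # Expected counts if rows and columns were independent.
    expected = row_totals @ col_totals / observed.sum()
    statistic = ((observed - expected) ** 2 / expected).sum()
    dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
    return float(statistic), dof, expected


statistic, dof, expected = chi_square_statistic(observed)
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841  # chi-square critical value, df=1, alpha=0.05
reject_independence = statistic > CRITICAL_VALUE_DF1_ALPHA_05
print(statistic, dof, reject_independence)
```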
FORECASTING MONTHLY ARMED ROBBERIES IN BOSTON
Time Series Forecasting
Several time series forecasting methods and techniques are used to forecast whether robberies in Boston will increase or decrease in the coming years, which will help the government and police departments take measures accordingly.
1. Dickey-Fuller test for stationarity
2. ACF and PACF plots
3. Differencing to make the series stationary
4. Box-Cox transformation
5. Building ARIMA model
6. Hyperparameter tuning
7. Rolling forecasting to capture random variation
8. Exponential smoothing
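Three of the steps above (differencing, exponential smoothing, and rolling one-step forecasting with a small parameter search) can be sketched with NumPy alone. The series below is synthetic, not the Boston data, and the function names are assumptions. Simple exponential smoothing stands in for the forecasting model here; the project itself lists ARIMA, which would typically come from a library such as statsmodels.

```python
import numpy as np


def difference(series, lag=1):
    """Lag differencing, a common way to remove a trend."""
    series = np.asarray(series, dtype=float)
    return series[lag:] - series[:-lag]


def rolling_ses_forecasts(series, alpha):
    """One-step-ahead simple exponential smoothing forecasts.

    forecasts[t - 1] is the forecast for series[t], made using only series[:t].
    """
    series = np.asarray(series, dtype=float)
    forecasts = np.empty(len(series) - 1)
    level = series[0]  # initialise the level with the first observation
    for t in range(1, len(series)):
        forecasts[t - 1] = level
        level = alpha * series[t] + (1 - alpha) * level
    return forecasts


def rmse(actual, predicted):
    diff = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Synthetic monthly counts (an assumption): upward trend plus noise.
rng = np.random.default_rng(42)
months = np.arange(60)
counts = 40 + 2.5 * months + rng.normal(0, 8, size=60)

diffed = difference(counts)  # trend removed, one observation shorter

# Tiny grid search over the smoothing parameter.
alphas = [0.1, 0.3, 0.5, 0.7, 0.9]
errors = {a: rmse(counts[1:], rolling_ses_forecasts(counts, a)) for a in alphas}
best_alpha = min(errors, key=errors.get)
print("best alpha:", best_alpha, "RMSE:", errors[best_alpha])
```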
HOUSE PRICE PREDICTION
Regression Problem
Prices are a good indicator of both the overall market condition and the economic health of a country. Buyers are concerned not only with the size (square feet) of a house; various other factors also play a key role in deciding the price of a house or property. In this project, we wrangle a large set of property sales records with unknown data quality issues.
Algorithms used:
1. Linear Regression
2. Ridge Regression
3. Grid Search for hyperparameter tuning
Feature engineering:
1. MinMaxScaler
2. StandardScaler
Encoding techniques:
1. OneHot encoding
2. Label encoding
Feature selection:
1. Feed-forward selection
Model validation:
1. LOOCV (Leave-One-Out Cross-Validation)
2. K-Fold Cross-Validation
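The combination of min-max scaling, ridge regression, K-fold cross-validation, and a grid search over the regularisation strength can be sketched with NumPy alone. Everything below is an illustrative assumption: the data is synthetic, the helper names are made up, and the project itself presumably relied on scikit-learn's scaler and model-selection classes instead of hand-written code.

```python
import numpy as np


def min_max_fit(X):
    """Return the per-column minimum and range used for min-max scaling."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi - lo == 0, 1.0, hi - lo)  # guard against constant columns
    return lo, span


def fit_ridge(X, y, alpha):
    """Closed-form ridge regression; the intercept is not penalised."""
    Xb = np.column_stack([np.ones(len(X)), X])
    penalty = alpha * np.eye(Xb.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)


def predict(X, coef):
    return coef[0] + X @ coef[1:]


def k_fold_rmse(X, y, alpha, k=5, seed=0):
    """Average RMSE over k folds; k == len(X) gives leave-one-out CV."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)
    errors = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate(folds[:i] + folds[i + 1:])
        # Fit the scaler on the training fold only to avoid leakage.
        lo, span = min_max_fit(X[train_idx])
        coef = fit_ridge((X[train_idx] - lo) / span, y[train_idx], alpha)
        residual = y[test_idx] - predict((X[test_idx] - lo) / span, coef)
        errors.append(np.sqrt(np.mean(residual ** 2)))
    return float(np.mean(errors))


# Synthetic housing data (an assumption): price driven by area and bedrooms.
rng = np.random.default_rng(1)
area = rng.uniform(500, 3500, size=200)
bedrooms = rng.integers(1, 6, size=200)
X = np.column_stack([area, bedrooms]).astype(float)
y = 50_000 + 120 * area + 15_000 * bedrooms + rng.normal(0, 20_000, size=200)

# Grid search: pick the alpha with the lowest cross-validated RMSE.
alphas = [0.0, 0.01, 0.1, 1.0, 10.0]
cv_scores = {a: k_fold_rmse(X, y, a, k=5) for a in alphas}
best_alpha = min(cv_scores, key=cv_scores.get)
print("best alpha:", best_alpha, "CV RMSE:", cv_scores[best_alpha])
```

With `alpha=0.0` the model is ordinary linear regression, so the same loop compares both algorithms listed above.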
Text Summarization using LSTMs
Natural Language Processing
The objective here is to generate summaries of the "Amazon Fine Food Reviews" dataset using both abstraction-based and extraction-based text summarization approaches.
Project pipeline:
1. Understanding text summarization
2. Text preprocessing
3. Abstractive text summarization using an LSTM encoder-decoder architecture
4. Web scraping an article using BS4
5. Extractive text summarization using a Transformer
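To make the extractive idea concrete, here is a deliberately simple sketch that swaps the Transformer for plain word-frequency scoring: it picks the existing sentences whose words occur most often in the text, so the summary is always copied from the input and never generated. The stopword list, the review text, and the function names are assumptions for illustration only.

```python
import re
from collections import Counter

# A tiny, hypothetical stopword list; real pipelines use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "is", "it", "of", "to", "in",
             "this", "that", "for", "was", "i", "my", "but", "these"}


def split_sentences(text):
    """Split on sentence-ending punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def extractive_summary(text, n_sentences=1):
    """Return the n highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    freq = Counter(tokenize(text))

    def score(sentence):
        tokens = tokenize(sentence)
        return sum(freq[w] for w in tokens) / max(len(tokens), 1)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:n_sentences])
    return " ".join(sentences[i] for i in keep)


review = ("I bought these cookies for my kids. "
          "The cookies taste great and the cookies arrived fresh. "
          "Shipping was slow.")
summary = extractive_summary(review, n_sentences=1)
print(summary)
```

An abstractive model, by contrast, generates new wording and may produce sentences that never appear in the source text.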
Evaluation Metric for GANs
Advanced Deep Learning
Evaluating a supervised image classifier is simple: the predicted output is compared to the actual label. A GAN, however, generates a fake image from random noise, and that image should look as authentic as possible. So how do you measure how realistic a computer-generated image is? In other words, how do you evaluate a GAN?
One of the most widely used measures of the feature distance between real and generated images is the Fréchet Inception Distance (FID). The Fréchet distance is a measure of similarity between curves that takes the location and ordering of points along the curves into account. It can also be used to measure the distance between two distributions.
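For two Gaussian feature distributions with means μ_r, μ_g and covariances Σ_r, Σ_g, the Fréchet distance has the closed form ||μ_r − μ_g||² + Tr(Σ_r + Σ_g − 2(Σ_r Σ_g)^(1/2)). The sketch below computes that quantity with NumPy. In real FID the feature vectors come from a pretrained Inception network; here random vectors stand in for them, which is an assumption made purely to keep the example self-contained.

```python
import numpy as np


def frechet_distance(feats_real, feats_fake):
    """Fréchet distance between Gaussians fitted to two sets of feature vectors."""
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    # Tr((cov_r cov_f)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_f, which avoids an explicit matrix square root.
    eigvals = np.linalg.eigvals(cov_r @ cov_f)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = ((mu_r - mu_f) ** 2).sum()
    return float(mean_term + np.trace(cov_r) + np.trace(cov_f) - 2.0 * trace_sqrt)


# Stand-in "features" (an assumption): 500 vectors of dimension 4.
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 4))
shifted = real + 1.0  # same covariance, mean moved by 1 in each dimension

print(frechet_distance(real, real))     # close to 0: identical distributions
print(frechet_distance(real, shifted))  # close to 4: squared length of the shift
```

A lower value means the generated features are statistically closer to the real ones.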
Real-Time Classification of Indian Car Models
Convolutional Neural Network
A convolutional neural network that predicts the Indian car model in a real-time scenario and can run on a mobile phone. Transfer learning with MobileNets turns out to be the best-fit model. Use case: instantly identify a car model at your fingertips.
BANK CLIENT CLASSIFICATION
Classification Problem
Clients applying for a loan are classified as good or bad clients based on the details they provide to the bank, so that the bank can make an informed decision, avoid the risk of non-repayment, and reduce financial damage to the bank.
3. Grid Search for hyperparameter tuning
Scoring metrics used:
1. AUC Score
2. Precision Score
3. Recall Score
4. Accuracy Score
5. Kappa Score
6. F1 score
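The metrics listed above can all be derived from the confusion matrix, plus the predicted scores for AUC. The sketch below uses pure Python and hypothetical labels where 1 marks a bad client; in practice scikit-learn's `sklearn.metrics` module provides these scores, and the function names here are only illustrative.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1 and Cohen's kappa for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n = tp + fp + fn + tn

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # Agreement expected by chance, from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "kappa": kappa}


def auc_score(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predictions: 1 = bad client, 0 = good client.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.05]

metrics = binary_metrics(y_true, y_pred)
auc = auc_score(y_true, scores)
print(metrics, auc)
```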
DeepLearningMiniProjects
Deep Learning
Content:
1. Classifying Cat/Dog
2. Forecasting stock price using LSTM
3. Predicting bank customer churn
4. Predicting pressure level
Techniques used:
1. NearestNeighbors with cosine metric
2. simplerecommender
3. SVDpp from package 'surprise'
4. Apriori from mlxtend
5. association_rules from mlxtend
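The first technique in the list, nearest neighbours under a cosine metric, can be sketched by brute force with NumPy. scikit-learn's `NearestNeighbors(metric="cosine")` performs the equivalent search; the rating matrix and the function name below are hypothetical.

```python
import numpy as np

# Hypothetical user-item rating matrix: rows are users, columns are items,
# and 0 means "not rated".
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
    [1, 0, 4, 5],
], dtype=float)


def cosine_neighbors(matrix, index, k=1):
    """Return the k rows most similar to matrix[index] by cosine similarity."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    unit = matrix / np.where(norms == 0, 1.0, norms)  # guard against empty rows
    sims = unit @ unit[index]
    sims[index] = -np.inf  # never return the query row itself
    order = np.argsort(-sims)[:k]
    return [(int(i), float(sims[i])) for i in order]


neighbors = cosine_neighbors(ratings, index=0, k=2)
print(neighbors)
```

Passing `ratings.T` instead finds similar items, which is the usual basis for item-to-item recommendations.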
Algorithms used:
1. Logistic Regression
2. Gaussian Naive Bayes
3. KNN classifier
4. Decision Tree
5. Random Forest
6. XGBoost
Other techniques:
1. SMOTE data balancing
2. RFE feature selection
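The core idea behind SMOTE data balancing can be sketched with NumPy: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is a simplified illustration with hypothetical data and a made-up function name, not the implementation from the imbalanced-learn package that is normally used for this.

```python
import numpy as np


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by neighbour interpolation.

    Requires more than k minority samples.
    """
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    # Pairwise Euclidean distances between minority samples.
    dists = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    neighbors = np.argsort(dists, axis=1)[:, :k]

    base = rng.integers(0, len(minority), size=n_new)        # starting samples
    chosen = neighbors[base, rng.integers(0, k, size=n_new)]  # one neighbour each
    gap = rng.random((n_new, 1))                              # position on the segment
    return minority[base] + gap * (minority[chosen] - minority[base])


# Hypothetical minority-class samples with two features.
minority = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3], [0.9, 0.7]])
synthetic = smote_like_oversample(minority, n_new=10, k=3)
print(synthetic.shape)
```

Oversampling of this kind should be applied to the training split only, so that synthetic points never leak into the evaluation data.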