AI Practice Manual for Beginner and Intermediate AI Engineers

Jan 13, 2023

By Dr Mabrouka Abuhmida

This manual provides examples of deep learning and machine learning concepts with accompanying datasets. It is a guide for beginner and intermediate level AI engineers to practice on AI projects and datasets.

Note: This manual outlines my learning progress and trajectory.

I have organized the projects into seven main categories; however, some projects overlap and fit into more than one category.

Computer vision
Natural language processing
Voice recognition and NLP
Anomaly detection
Time series analysis
Medical diagnosis
Behaviour and automation analysis

Additionally, I have developed two starter kits to provide an introduction to the concepts of these projects based on their type.

Classification
Regression

How to use this manual:

To get the most from this manual, begin with the starter kit projects, then work through the projects in your chosen category to build your skills and knowledge.

Classification starter kit projects:

Classifying plant species using the Fisher’s Iris dataset.
Classifying human activity using the Human Activity Recognition dataset.
Classifying wine quality using the Wine Quality dataset.
Classifying sentiments in tweets using the Twitter Sentiment Analysis dataset.
Classifying images of clothing by apparel type using the Fashion MNIST dataset, and handwritten digits using the MNIST dataset.
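
To make the idea of classification concrete, here is a minimal sketch of a k-nearest-neighbours classifier written with the Python standard library only. The rows are a small hand-typed sample in the Iris format (sepal length, sepal width, petal length, petal width) and should be treated as illustrative; the function names `knn_predict` and `accuracy` are my own choices, not part of any library. In a real project you would load the full dataset and most likely use an established machine learning library instead.

```python
import math
from collections import Counter

# Hand-typed rows in the Iris format, in cm:
# (sepal length, sepal width, petal length, petal width), species.
# Illustrative sample only, not the full dataset.
TRAIN = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"),
    ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((4.7, 3.2, 1.3, 0.2), "setosa"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "versicolor"),
    ((6.9, 3.1, 4.9, 1.5), "versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "virginica"),
    ((5.8, 2.7, 5.1, 1.9), "virginica"),
    ((7.1, 3.0, 5.9, 2.1), "virginica"),
]


def knn_predict(train, query, k=3):
    """Return the majority label among the k training rows closest to query."""
    # Sort training rows by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda row: math.dist(row[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of test rows whose predicted label matches the true label."""
    correct = sum(knn_predict(train, features, k) == label
                  for features, label in test)
    return correct / len(test)


if __name__ == "__main__":
    print(knn_predict(TRAIN, (5.0, 3.4, 1.5, 0.2)))
```

The same two steps, predicting a label and then measuring accuracy on rows the model has not seen, apply to every project in the classification starter kit.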

Regression starter kit projects:

Predicting forest fires using the Wildfire dataset.
Predicting the likelihood of customer conversion using the E-Commerce dataset.
Predicting the likelihood of a customer making a purchase using the Online Shoppers Intention dataset.
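
As a rough illustration of what a regression model does, the sketch below fits a straight line by ordinary least squares and scores it with mean squared error on held-out points. It uses only the standard library. The numbers are made-up toy values, and the names `fit_line` and `mean_squared_error` are assumptions of this sketch rather than a library API.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(ys, preds):
    """Average squared difference between true and predicted values."""
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Hypothetical toy data: one numeric feature and one numeric target.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [3.1, 4.9, 7.2, 9.0, 10.8, 13.1]

# Fit on the first four points, evaluate on the last two.
slope, intercept = fit_line(xs[:4], ys[:4])
preds = [slope * x + intercept for x in xs[4:]]
print("slope:", slope, "intercept:", intercept)
print("test MSE:", mean_squared_error(ys[4:], preds))
```

Holding back some data for evaluation, as done here with the last two points, is the habit worth carrying into the larger regression projects.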

Here are a few recommendations to help you make the most of your time learning AI with Python:

1- Start with a tutorial or online course: There are many resources available online that can help you get started with Python. Look for a tutorial or course that covers the basics of the language and helps you get comfortable with writing and running Python code.

2- Experiment with code: As you learn new concepts and techniques, try writing code to see how things work in practice. This will help you understand the concepts better and give you a chance to apply what you’ve learned.

3- Practice, practice, practice: The more you practice coding, the more comfortable you’ll become with the language and the easier it will be to write and understand code.

4- Keep up to date: Python is a constantly evolving language, with new versions and features being released regularly. Stay up to date with the latest developments in Python by following online communities, blogs, and forums for Python programmers.

5- Get involved in the Python community: There are many online communities and forums for Python programmers, as well as in-person events and meetups. Participating in these communities can be a great way to learn from others, ask questions, and stay motivated as you learn Python.

I will leave you now with the projects and practice examples:

A. Computer vision

Image Classification: Use a dataset of images (such as the CIFAR-10 or ImageNet datasets) and train a deep learning model to classify the images into different categories.
Object Detection: Use a dataset of images annotated with bounding boxes around objects (such as the PASCAL VOC or COCO datasets) and train a deep learning model to detect and classify the objects in the images.

B. Natural language processing

Sentiment Analysis: Use a dataset of text reviews (such as the IMDB movie review dataset) and train a deep learning model to classify the reviews as positive or negative.
Machine Translation: Use a dataset of parallel text in different languages (such as the WMT English-French translation dataset) and train a deep learning model to translate the text from one language to another.
Language translation: Use a dataset of texts in one language and their translations to train a model to translate new texts from one language to another. Examples of datasets for this type of project include the IWSLT dataset and the WMT dataset.
Text Generation: Use a dataset of text (such as a dataset of Wikipedia articles) and train a deep learning model to generate new, coherent text based on the patterns in the data.
Text classification: Use a dataset of texts and their labels to train a model to classify new texts into predefined categories. Examples of datasets for this type of project include the 20 Newsgroups dataset and the IMDB movie review dataset.
Handwriting Generation: Use a dataset of handwriting samples and train a deep learning model to generate new handwriting samples that mimic the style of the original samples.
Chatbot: Use a dataset of conversations and train a deep learning model to respond to user input in a natural, conversational manner.
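
Before training a deep learning model for sentiment analysis or text classification, it can help to build a simple baseline. The sketch below is a bag-of-words multinomial Naive Bayes classifier with add-one smoothing, using only the standard library. It swaps in a classical technique in place of the deep learning model described above, the four reviews are invented for illustration, and all function names are assumptions of this sketch.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train_naive_bayes(examples):
    """examples: list of (text, label). Returns the counts needed to predict."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in label_counts.items():
        score = math.log(doc_count / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy reviews, for illustration only.
REVIEWS = [
    ("a wonderful and moving film", "pos"),
    ("great acting and a great story", "pos"),
    ("boring plot and terrible acting", "neg"),
    ("a dull and terrible film", "neg"),
]
model = train_naive_bayes(REVIEWS)
print(predict(model, "a great and wonderful story"))
```

A baseline like this gives you a score to beat when you move on to a neural model trained on a real dataset such as IMDB.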

C. Voice recognition and NLP

Music Generation: Use a dataset of music compositions and train a deep learning model to generate new, original music based on the patterns in the data.
Speech recognition: Use a dataset of audio recordings and their transcriptions to train a model to transcribe new audio recordings. Examples of datasets for this type of project include the CommonVoice dataset and the LibriSpeech dataset.

D. Anomaly detection

Credit Risk Prediction: Use a dataset of credit application data and train a deep learning model to predict the likelihood of a credit applicant defaulting on their loan. For example, you could predict the likelihood of loan default using the Lending Club dataset.

Anomaly detection: Use a dataset of normal and anomalous examples to train a model to detect anomalous instances in new data. Examples of datasets for this type of project include the Numenta Anomaly Benchmark dataset and the Yahoo S5 dataset. A related practice task is classifying spam emails using the Spambase dataset.

Fake News Detection: Use a dataset of news articles and train a deep learning model to identify fake news articles. A related practice task is classifying news articles into categories using the 20 Newsgroups dataset.

Fraud detection: Use a dataset of normal and fraudulent transactions to train a model to detect fraudulent transactions in new data. Examples of datasets for this type of project include the Credit Card Fraud Detection dataset and the Financial Fraud Detection dataset.
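
A useful first step in anomaly or fraud detection is a purely statistical baseline that needs no training labels. The sketch below flags outliers with the modified z-score, which is based on the median and the median absolute deviation (MAD) and is therefore less distorted by the outliers themselves than a mean-based score. The cutoff of 3.5 is a commonly quoted rule of thumb, the transaction amounts are invented, and `mad_anomalies` is a name chosen for this sketch.

```python
import statistics


def mad_anomalies(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold.

    modified z = 0.6745 * (x - median) / MAD, where MAD is the median of the
    absolute deviations from the median.
    """
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return []  # no spread to measure against; nothing can be flagged
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - median) / mad > threshold
    ]


# Invented transaction amounts with one obviously unusual value.
amounts = [12.5, 9.9, 11.2, 10.4, 13.1, 250.0, 10.8, 12.0, 9.5, 11.7]
print(mad_anomalies(amounts))
```

On real fraud datasets the anomalies are rarely this obvious, which is where the trained models described above come in, but a baseline like this helps you sanity-check the data first.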

E. Time series analysis

Customer Segmentation: Use a dataset of customer data (such as demographic information and purchase history) and train a deep learning model to segment customers into different groups based on their characteristics.

Time series prediction: Use a dataset of time series data to train a model to predict future values in the series. Examples of datasets for this type of project include the Numenta Anomaly Benchmark dataset and the Solar Energy dataset.

Stock price prediction: Use a dataset of historical stock prices to train a model to predict future stock prices. Examples of datasets for this type of project include the Yahoo Finance dataset and the Google Finance dataset.

Sales Forecasting: Use a dataset of sales data and train a deep learning model to forecast future sales. A related example is predicting house prices using the Zillow dataset.

Energy Consumption Prediction: Use a dataset of energy consumption data (such as temperature and occupancy levels) and train a deep learning model to predict energy consumption in a building.
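
For any of the forecasting projects above, it is worth comparing your model against a naive baseline evaluated in time order. The sketch below forecasts each point as the average of the previous few observations and measures the mean absolute error using walk-forward evaluation, so that each forecast only sees the past. It uses only the standard library; the series is invented and the function names are assumptions of this sketch.

```python
def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def walk_forward_mae(series, n_test, window=3):
    """Mean absolute error of one-step-ahead forecasts on the last n_test points.

    Each forecast uses only the values that come before the point being
    predicted, which avoids leaking future information into the evaluation.
    """
    split = len(series) - n_test
    errors = []
    for t in range(split, len(series)):
        forecast = moving_average_forecast(series[:t], window)
        errors.append(abs(series[t] - forecast))
    return sum(errors) / len(errors)


# Invented daily values, for illustration only.
series = [20.0, 21.5, 19.8, 22.1, 23.0, 22.4, 24.2, 25.1]
print("next-step forecast:", moving_average_forecast(series))
print("walk-forward MAE:", walk_forward_mae(series, n_test=3))
```

If a deep learning model cannot beat this baseline on held-out data, that is usually a sign to revisit the features or the evaluation setup before adding complexity.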

F. Medical diagnosis

Disease Diagnosis: Use a dataset of medical images and train a deep learning model to diagnose diseases based on the images.
Predicting diabetes: Use the Pima Indians Diabetes dataset from the UCI Machine Learning Repository and build a model to predict whether an individual has diabetes based on various features such as age, blood pressure, and body mass index (BMI). You could use a decision tree or a random forest for this task.
Predicting heart disease: Use the Cleveland Clinic Foundation Heart Disease dataset from the UCI Machine Learning Repository and build a model to predict the presence of heart disease based on various features such as age, blood pressure, and cholesterol levels. You could use a logistic regression or a support vector machine for this task.
Predicting breast cancer: Use the Wisconsin Diagnostic Breast Cancer dataset from the UCI Machine Learning Repository and build a model to predict whether a breast tumor is benign or malignant based on features computed from the cell nuclei in each sample, such as their radius and texture. You could use a decision tree or a random forest for this task.
Predicting COVID-19: Use a COVID-19 CT scan dataset and build a model to predict whether a patient has or has had COVID-19 based on their lung CT scans.
Protein Structure Prediction: Use a dataset of protein sequences and train a deep learning model to predict the 3D structure of a protein.
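
Several of the projects above suggest a decision tree. To show what a tree learns at a single node, here is a minimal sketch of a decision stump, which is a decision tree of depth one: it searches for the threshold on one numeric feature that misclassifies the fewest rows. For simplicity it assumes that higher feature values point to the positive class and it counts errors directly, whereas full decision tree implementations typically use a criterion such as Gini impurity. The BMI values and labels are invented, and `fit_stump` and `predict_stump` are names chosen for this sketch.

```python
def fit_stump(values, labels):
    """Find the threshold on one feature that misclassifies the fewest rows.

    The stump predicts 1 when value > threshold and 0 otherwise.
    Returns (threshold, number_of_training_errors).
    """
    ordered = sorted(set(values))
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    best_threshold, best_errors = None, len(values) + 1
    for threshold in candidates:
        errors = sum((v > threshold) != bool(y)
                     for v, y in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold, best_errors


def predict_stump(threshold, value):
    """Apply the learned rule to a new value."""
    return 1 if value > threshold else 0


# Invented BMI values and labels (1 = positive diagnosis), illustration only.
bmi = [22.0, 24.5, 27.0, 31.0, 33.5, 36.0]
diagnosis = [0, 0, 0, 1, 1, 1]

threshold, errors = fit_stump(bmi, diagnosis)
print("threshold:", threshold, "training errors:", errors)
print("prediction for BMI 34.0:", predict_stump(threshold, 34.0))
```

A full decision tree repeats this kind of split recursively on each resulting subset, and a random forest averages many such trees.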

G. Behaviour and automation analysis

Traffic and accident prediction: Use a dataset of traffic data (such as traffic counts and weather conditions) and train a deep learning model to predict traffic patterns. For example, you could predict traffic accidents using the Accidents in New York dataset.
Recommendation systems: Use a dataset of user interactions with items (such as clicks on a website or purchases in an online store) to train a model to recommend new items to users. Examples of datasets for this type of project include the MovieLens dataset and the Amazon Reviews dataset.
Sentiment Analysis for Social Media: Use a dataset of social media posts and train a deep learning model to classify the sentiment (positive, negative, neutral) of the posts. A related example is classifying movie reviews as positive or negative using the IMDB dataset.
Customer segmentation: Use a dataset of customer data to train a model to cluster customers into different groups based on their characteristics. Examples of datasets for this type of project include the Wholesale Customers dataset and the Online Retail dataset.
Predicting customer churn and spending: Predict the likelihood of customer churn using the Telco Customer Churn dataset, and predict customer spending using the Online Retail dataset.
Supply Chain Optimization: Use a dataset of supply chain data (such as inventory levels and delivery times) and train a deep learning model to optimize the supply chain.
Predicting employee attrition: Predict the likelihood of an employee leaving the company using the HR Attrition dataset.
Predicting equipment failures: Use an equipment failure dataset and build a model to predict the likelihood that a piece of equipment will fail based on various features such as usage, age, and maintenance history. You could use a decision tree or a random forest for this task.
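
The recommendation systems project in this category can be prototyped without any training at all. The sketch below is a tiny item-based approach: each item is represented by the ratings users gave it, and the item most similar to a target is found with cosine similarity. It treats a rating of 0 as "not rated", which is a simplification. The ratings and item names are invented, and the function names are assumptions of this sketch.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an item nobody rated is similar to nothing
    return dot / (norm_a * norm_b)


def most_similar_item(ratings, target):
    """ratings: dict mapping item -> list of ratings, one entry per user."""
    scores = {
        item: cosine_similarity(ratings[target], vector)
        for item, vector in ratings.items()
        if item != target
    }
    return max(scores, key=scores.get)


# Invented ratings from four users (0 = not rated), illustration only.
ratings = {
    "film_a": [5, 4, 0, 1],
    "film_b": [4, 5, 0, 2],
    "film_c": [0, 1, 5, 4],
}
print(most_similar_item(ratings, "film_a"))
```

With a real dataset such as MovieLens, the same idea scales up by building the rating vectors from the interaction log and recommending the top few similar items rather than only one.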

Dr. Mabrouka Abuhmida

Educator
