
“Unleashing the Power of MAGIC: Malky M’s Captivating Presentation”

MAGIC Mentee Malky M’s Journey in Machine Learning

Last year, Malky M joined the Third Monkeys Mentorship program with little programming knowledge. However, in a short period of time, she accomplished a great deal. She started by learning Python and then delved into the world of machine learning. Eventually, she built a model that can make predictions, which we will discuss shortly.

One of my favorite moments working with Malky was when we encountered difficult concepts. She would ask me to explain them as if she were five years old. This forced me to distill complex ideas into their core concepts. It also demonstrated Malky’s determination to understand things at a fundamental level, a skill that will serve her well in programming and problem-solving.

Now, let’s welcome Malky to share her journey and project.

Introduction

My name is Malky M, a ninth grader at SKA School. I have a keen interest in the STEM field. My mentor, Leia Einhorn, is a software engineer at Kepler Group in New York. Her hobbies include reading and running.

The name of my project is Sales AI, which utilizes machine learning to analyze item features and sales data to predict future sales. I chose this project to delve deeper into data science, specifically predictive analytics. As someone interested in marketing and targeted ads, I found this project highly relevant.

Learning Journey

Throughout this project, I had the opportunity to learn various topics and technologies that were new to me. I gained knowledge in machine learning, including supervised and unsupervised learning, dependent and independent variables, and the use of machine learning libraries. I also learned how to analyze and understand data using Kaggle datasets. Additionally, I acquired Python skills through Codecademy, covering data structures and data types. Google Colab became my go-to tool for coding.

Highlights

There were several highlights during my journey in this program. Firstly, I was thrilled to learn Python and realize its potential for future use. I enjoyed exploring a topic that genuinely interests me and witnessing my algorithm come together to predict sales. It was fascinating to see the overlap between machine learning and the topics I am currently studying in math, providing me with a deeper understanding.

Of course, there were challenges along the way. Initially, finding a suitable commercial dataset was difficult, and I had to be flexible and change the direction of my project. Coding frustrations also arose when I encountered tiny errors that were hard to spot. However, these challenges taught me valuable lessons. I learned the importance of choosing the simplest approach that works instead of making things unnecessarily complicated. I also realized that seeking help is not a weakness and that utilizing available resources is crucial for success. Moreover, I developed the habit of understanding the code I was typing instead of mindlessly copying and pasting.

Project Implementation

Let’s dive into the implementation of my project. I utilized various programs and libraries to simplify the coding process. Google Colab allowed me to access my training dataset stored in Google Drive. I imported the dataset and assigned it to the variable “DF” for easier referencing throughout my code. The dataset contained information such as item weight, item type, item visibility, and item fat content.
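As a rough illustration of this loading step, here is a minimal sketch assuming the pandas library, which the talk does not name. The column names, the Drive path in the comments, and the three rows are all made up for the example; a small inline CSV stands in for the real file so the code runs outside Colab.

```python
import io
import pandas as pd

# In Colab, a file in Google Drive is usually reached by mounting the drive first:
#   from google.colab import drive
#   drive.mount("/content/drive")
#   df = pd.read_csv("/content/drive/MyDrive/train.csv")  # hypothetical path
# A tiny made-up CSV stands in for that file here.
csv_text = """item_weight,item_type,item_visibility,item_fat_content,sales
10.0,Dairy,0.10,Low Fat,100.0
20.0,Snacks,0.20,Regular,200.0
,Meat,0.30,Low Fat,150.0
"""

# Read the table into a DataFrame (the "DF" variable from the talk).
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # number of rows and columns
print(df.dtypes)  # which columns hold numbers and which hold text
```

Printing the shape and column types right after loading is a quick way to confirm the file was read as expected before any cleaning starts.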

Cleaning the data was an essential step. I identified and filled in blank spaces within the dataset. For numerical data, like item weight, I used the average to fill in the blanks. For categorical data, like item type, I used the mode, which is the most common value. Filling with the average meant that the repairs did not change the column's overall average, and filling with the mode kept the most common category the same.
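The filling step described above might look something like the following sketch. It assumes pandas and uses made-up column names and values; the project's actual code may differ.

```python
import pandas as pd

# Made-up data with one blank in each column.
df = pd.DataFrame({
    "item_weight": [10.0, None, 20.0, 30.0],
    "item_type": ["Dairy", "Snacks", None, "Dairy"],
})

# pandas ignores blanks when averaging, so this is the mean of 10, 20 and 30.
mean_before = df["item_weight"].mean()

# Numerical column: fill blanks with the average.
df["item_weight"] = df["item_weight"].fillna(mean_before)

# Categorical column: fill blanks with the mode (the most common value).
most_common_type = df["item_type"].mode()[0]
df["item_type"] = df["item_type"].fillna(most_common_type)

print(df)
print(df["item_weight"].mean() == mean_before)  # the average is unchanged
```

Because the blank weight is replaced by the average itself, the column's average stays the same after the repair.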

Next, I encoded the words in the dataset as numbers so that the algorithm could process them. Each distinct entry in a text column was assigned its own number. For example, “1” could represent “low fat” while “2” could represent “regular.”

Variable                      Encoded Value
Item Fat Content: Low Fat     1
Item Fat Content: Regular     2
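One simple way to apply the low fat / regular encoding described above is a lookup dictionary. This is a sketch assuming pandas and a hypothetical column name, not necessarily the method used in the project.

```python
import pandas as pd

# Made-up fat-content column containing text labels.
df = pd.DataFrame({"item_fat_content": ["Low Fat", "Regular", "Regular", "Low Fat"]})

# Lookup table from each label to its number.
fat_codes = {"Low Fat": 1, "Regular": 2}

# Replace every label with its number.
df["item_fat_content"] = df["item_fat_content"].map(fat_codes)

print(df["item_fat_content"].tolist())  # [1, 2, 2, 1]
```

Any label missing from the dictionary becomes a blank (NaN) after `map`, so it is worth checking for blanks again once the encoding is done.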

To train the model, I separated the final sales column from the other variables. The final sales were saved under the variable “Y,” while the remaining data was saved under the variable “X.” I used linear regression to calculate the relationship between the independent variables (item features) and the dependent variable (final sales). This process is known as training the model.
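The X/Y split and the training step can be sketched as follows. The sketch assumes scikit-learn's `LinearRegression`, since the talk does not say which library was used, and the data is made up so that sales follow an exact rule the model can recover.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up, already cleaned and encoded data.
# Sales were built as 10 * weight + 10 * fat_code + 10.
df = pd.DataFrame({
    "item_weight": [5.0, 10.0, 15.0, 20.0, 25.0],
    "item_fat_content": [1, 2, 1, 2, 1],
    "sales": [70.0, 130.0, 170.0, 230.0, 270.0],
})

# Y is the column to predict; X is everything else.
Y = df["sales"]
X = df.drop(columns=["sales"])

# Training finds the weights that best relate X to Y.
model = LinearRegression()
model.fit(X, Y)

print(model.coef_)       # one learned weight per feature
print(model.intercept_)  # the learned constant term
```

On this toy data the learned weights come out very close to 10 for each feature, matching the rule used to build the sales column. Real sales data is noisier, so the fit will not be exact.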

Once the model was trained, I tested it by making predictions. I fed item data into the model and observed how close the predictions were to the actual sales data. For the final test, I imported a separate dataset and performed the same data cleaning and encoding steps on it. Finally, I ran the test dataset through the model and obtained the predicted sales.
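To make sure a test dataset goes through the same steps as the training data, the cleaning and encoding can be wrapped in one function and applied to both. The sketch below assumes pandas and scikit-learn; the `prepare` helper, the column names, and the data are hypothetical. It fills blanks in the test set with the training average, which is one common choice and may differ from what was done in the project.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

FAT_CODES = {"Low Fat": 1, "Regular": 2}
FEATURES = ["item_weight", "item_fat_content"]

def prepare(df, fill_weight):
    """Apply the same cleaning and encoding to any dataset."""
    out = df.copy()
    out["item_weight"] = out["item_weight"].fillna(fill_weight)
    out["item_fat_content"] = out["item_fat_content"].map(FAT_CODES)
    return out

# Made-up training data (with sales) and test data (without sales).
train = pd.DataFrame({
    "item_weight": [5.0, 10.0, None, 20.0, 25.0],
    "item_fat_content": ["Low Fat", "Regular", "Low Fat", "Regular", "Low Fat"],
    "sales": [70.0, 130.0, 170.0, 230.0, 270.0],
})
test = pd.DataFrame({
    "item_weight": [12.0, None],
    "item_fat_content": ["Regular", "Low Fat"],
})

# The average is taken from the training data and reused for the test data.
train_mean = train["item_weight"].mean()
train_clean = prepare(train, train_mean)
test_clean = prepare(test, train_mean)

# Train on the cleaned training data, then predict sales for the test items.
model = LinearRegression().fit(train_clean[FEATURES], train_clean["sales"])
predicted_sales = model.predict(test_clean[FEATURES])

print(predicted_sales)
```

Keeping the steps in a single function means the two datasets cannot drift apart, for example by encoding “Regular” as 2 in one and 3 in the other.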

Here are the final predictions:

Item      Predicted Sales
Item 1    100
Item 2    150
Item 3    80

Future Plans

I have made significant progress in this project, reaching the stage where I can make final predictions with the dataset. In the future, I plan to further improve the accuracy of the model through additional training. I also aim to present the final results using charts and graphs for a more visually appealing representation.

Thank you for listening to my journey. I am open to any questions you may have.

Question 1: How did you find the data you ended up using?

Answer: Initially, I wanted to work with consumer statistics, such as gender-based data. However, it was challenging to find readily available datasets for that specific topic. After extensive searching, I decided to focus on item statistics, which were more easily accessible. I found the dataset through Kaggle.

Question 2: What were the numbers like for the predictions? What was the accuracy?

Answer: I measured the accuracy of the predictions using a correlation coefficient. In this case, the correlation coefficient was 0.51, which I consider decent for a first project. I am satisfied with this result. However, I aim to train the model further to improve its accuracy in the future.
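For readers who want to compute a score like this themselves, here is a minimal sketch using NumPy. It assumes the Pearson correlation between actual and predicted sales, since the answer does not say which coefficient was used, and the numbers are made up rather than taken from the project.

```python
import numpy as np

def pearson_r(actual, predicted):
    """Pearson correlation: 1 is a perfect positive linear match, 0 is none."""
    # np.corrcoef returns a 2x2 matrix; the off-diagonal entry is r.
    return float(np.corrcoef(actual, predicted)[0, 1])

# Made-up actual and predicted sales for four items.
actual = np.array([100.0, 150.0, 80.0, 120.0])
predicted = np.array([110.0, 130.0, 95.0, 105.0])

r = pearson_r(actual, predicted)
print(round(r, 2))
```

A value near 1 means the predictions rise and fall with the real sales; a value near 0 means they barely track them.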

Thank you all for your questions and support!