AI for All is for Everybody
Authors: Ali Zaidi and Naz Huq
As graduating seniors at Florida International University (FIU), Charlie Ramirez and Sephora Jean-Mary headed into their final semester aiming to find a real-life business problem they could solve. Their senior design capstone project was supposed to enable these two front-end developers to display what they’d already learned as computer science students. And they could easily have taken a road well-traveled: build a website and prepare for commencement.
Instead, they opted for five months of demanding on-the-job training in data science, a field in which they had no prior experience. Their FIU professor, Dr. Masoud Sadjadi, and Naz helped Charlie and Sephora team up with the MITRE Corporation’s Generation AI initiative, which introduces students from all disciplines to AI and data science and to the role both will play in solving national challenges, such as restarting the U.S. economy.
MITRE and FIU have partnered to promote opportunities for joint research collaborations and to enable talented graduates to consider public service as a career path. According to Dr. Michael Balazs, the Generation AI Initiative Lead, “the MITRE-FIU partnership has allowed us to build a strong relationship around AI that will continue for years to come.”
Designing a Real-Life Learning Experience
Working with Dr. Sadjadi to create a meaningful challenge—beyond just learning to curate data—the students and their Gen AI mentors ultimately chose to examine an issue pervasive in the business world: how to leverage data science to generate market insights from Amazon product data. Amazon lists hundreds of millions of products for sale in almost every country of the world. To increase sales, businesses need reliable data to understand market trends and demand before investing in the release of new products.
As Charlie’s and Sephora’s guides through one of the world’s largest commercial entities—as well as the field of data science and machine learning—we conferred with Joe Garner, an instructional designer at MITRE’s McLean headquarters. Joe and Ali had worked on a lesson with similar requirements for a class in merchandising at Marymount University. The FIU challenge added front-end experience, something relatable to computer science students, as well as real-time web scraping and interactive visualizations.
According to Joe, “We wanted the students to develop the full stack, website and backend, from scratch while maintaining a weekly journal of their progress. It took one meeting for Charlie and Sephora to quickly draft a project proposal. Once we planned our sprints, they were off and running.”
Our premise was that the MITRE team was a client with a problem — how to generate, visualize, and model market insights via Amazon data — and the students’ job was to build us an interactive website, AI for All, to engage small business owners and other curious visitors in learning more about data science.
Beyond Studying Data to Actually Working With It
Working with data means working with tools to both get it and do something with it. Charlie and Sephora started with two tools that were new to them: Scrapy, an open source framework for extracting data from web pages, processing it, and storing it; and Anvil, a full stack web app development tool. Both use Python, a programming language beloved for its simple syntax and readable code. As the students prepared the data in Python, they removed the stray symbols and extra spaces that would get in the way of analysis, reformatted text, and more. Ali then showed them how to output the results as a CSV file and visualize the different Amazon products and search terms.
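For readers curious what that kind of cleanup can look like, here is a minimal sketch in Python. It is not the students' actual code: the function name clean_text and the exact set of characters it keeps are assumptions made for illustration.

```python
import re
import unicodedata


def clean_text(raw: str) -> str:
    """Tidy one scraped string (hypothetical helper, not from the project)."""
    # Normalize Unicode so look-alike characters (e.g. non-breaking
    # spaces) become their plain equivalents.
    text = unicodedata.normalize("NFKC", raw)
    # Assumed rule: keep letters, digits, whitespace, and common
    # punctuation; drop every other symbol.
    text = re.sub(r"[^\w\s.,:;!?'&()/+%$-]", "", text)
    # Collapse runs of whitespace and trim both ends.
    return re.sub(r"\s+", " ", text).strip()


print(clean_text("  Nike   Men's\u00a0Air Max ★★ \n"))
```

Which symbols count as noise depends on the data, so the character list in the second step would need adjusting for a real scrape.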
They tested the web scraper by searching for consumer items such as Nike shoes. Each scraped record included fields such as the product’s title, its average customer rating, and its sales rank. After collecting the data, Charlie and Sephora cleaned, formatted, and displayed it on the website they built, AI for All, to show their audience how scraping works and what results are possible.
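The three fields named above map naturally onto the columns of a CSV file. The sketch below uses Python's standard csv module to write such records. The column names, the products_to_csv helper, and the sample row are invented for illustration and do not come from the project or from real Amazon data.

```python
import csv
import io

# Hypothetical column names based on the fields described in the article.
FIELDS = ["title", "average_rating", "sales_rank"]


def products_to_csv(products, out):
    """Write scraped product dicts to a file-like object as CSV."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for p in products:
        writer.writerow({
            # Collapse extra spaces in the title.
            "title": " ".join(p["title"].split()),
            # Store the rating as a number rather than text.
            "average_rating": float(p["average_rating"]),
            # Assumed input format like "#1,204": strip "#" and commas.
            "sales_rank": int(str(p["sales_rank"]).replace(",", "").lstrip("#")),
        })


# Made-up sample record for demonstration only.
sample = [{"title": "  Example  Running Shoe ",
           "average_rating": "4.5",
           "sales_rank": "#1,204"}]

buf = io.StringIO()
products_to_csv(sample, buf)
print(buf.getvalue())
```

Passing an open file instead of the in-memory buffer, for example one created with open("products.csv", "w", newline=""), would save the same rows to disk.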
As AI for All itself advises, “the true value [of data] comes from being able to make accurate predictions for future products.” To that end, Ali taught them how to model what they had collected. After frequent meetings and demos, the students used Anvil to build a web application that walks visitors through using a web scraper, then cleaning and modeling the data it yields, and displays the results.
Using Data to Pay it Forward
Artificial intelligence, data science, and web development are highly technical endeavors, no question. But if we circle back to the original problem—using machine learning and AI to level the playing field for small businesses—the point isn’t the technology itself. It’s the good that comes from it—the public service. Ultimately, the purpose of AI for All is the same as the purpose of Generation AI Nexus: to introduce a community of curious, engaged learners to the power of the technology to solve challenging problems in a complex, changing world.
As Sephora put it: “We created an open source project that serves as the beginning of something with the potential to become incredibly valuable to a great number of people.”
In a busy world, that value comes from making it possible to do a better job of visualizing the data that enables the insight. Together, we built a heatmap to communicate market trends, a word cloud to show which terms were successfully capturing the attention of customers, scatterplots, and other graphs to display comparisons in the AI for All Visualizations app.
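A word cloud is, at heart, a picture of term counts: the more often a word appears, the larger it is drawn. The sketch below shows one way to compute those counts from product titles using only the Python standard library. The function name, the short stopword list, and the example titles are assumptions for illustration, not the project's code or data.

```python
import re
from collections import Counter

# A tiny, assumed stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "and", "for", "with", "of", "in", "to"}


def term_frequencies(titles, top_n=10):
    """Return the top_n most common words across the titles."""
    counts = Counter()
    for title in titles:
        # Lowercase, then pull out runs of letters, digits, and apostrophes.
        words = re.findall(r"[a-z0-9']+", title.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)


# Invented example titles.
titles = ["Running Shoes for Men", "Trail Running Shoes", "Running Socks"]
print(term_frequencies(titles, top_n=2))
```

The resulting word and count pairs are the kind of input a word cloud renderer sizes its words by.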
Looking Back and Looking Forward
Charlie and Sephora brought a lot to the table with their skills in front-end web development, so they were good teachers as well as good teammates. It was interesting to see how people outside the field of data science, but still technically competent in programming, approached it through their own lens. Both students quickly learned how to apply their programming skills to the data science workflow.
Afterwards, Charlie told us, “Throughout the time I spent studying at FIU, I was fortunate to have many great experiences. I became proficient in various programming languages as well as in techniques that helped me become a more efficient programmer. I developed problem solving skills and began to think in a way that allowed me to code more effortlessly. While each project taught me many things, none compare to the benefits and value I received from participating in this project with MITRE and Gen AI.”
At the end of the semester, the students demoed Machine Learning for Small Business Merchandisers virtually and got an A on their capstone. How did our team members grade our collaboration? AI+.
Ali Zaidi is a data scientist at the MITRE Corporation. He specializes in machine learning and helped launch Generation AI. He has an MS in Data Science from the University of Virginia.
Naz Huq is majoring in computer science at FIU, with the goal of becoming a software engineer. He worked as an intern at The MITRE Corporation during the spring 2020 semester. He enjoys traveling and sampling international cuisines.
© 2020 The MITRE Corporation. All rights reserved. Approved for public release. Distribution unlimited. Case number 19-01826-6
MITRE’s mission-driven teams are dedicated to solving problems for a safer world. Through our public-private partnerships and federally funded R&D centers, we work across government and in partnership with industry to tackle challenges to the safety, stability, and well-being of our nation.