Enhance Your Data Science Toolkit: Add-Ons and Updates to the Most Commonly Used Tools
Author: Ali Zaidi
As data science has exploded in popularity and use, so have the tools used to solve problems in the domain. Some of these open source programs and programming suites have become extremely popular, and developers have written a whole host of add-ons, extensions, and packages for them that improve functionality and save time for users, both experienced and new.
Both Python and R are widely used, partly because they are free and partly because of the extensive libraries developed specifically for data science. The most popular ways to program in either language, especially when learning data science, are the Jupyter Notebook environment for Python and the RStudio environment for R.
The JavaScript-based nbextensions add-on for Jupyter Notebook introduces a wide variety of updates to the classic system. Once it is installed, a new tab appears at the top when you load Jupyter Notebook. This tab allows you to pick and choose which add-ons you would like, depending on your use case. The image below demonstrates the extra tab and shows the options for the different specific add-ons. You can read more in the documentation for nbextensions.
If you are a new programmer, you might find the following add-ons in nbextensions especially helpful.
It’s Time to Get Organized!
The biggest issue with the Jupyter Notebook system is its lack of organizational features. Luckily, the Table of Contents add-on solves that for you. It builds a navigable outline from the headings and sub-headings in your notebook, so you can organize sections and jump between them with ease. Before this extension, I was using CMD + F (Ctrl + F on Windows) to find different sections of my code, which was extremely painful. Stay organized and keep yourself from wasting precious time trying to find a chunk of code.
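To make the idea concrete, here is a minimal sketch of how an outline can be derived from a notebook. It is not the add-on’s actual code; it only assumes that a notebook file is JSON with a list of cells and that section titles are written as Markdown headings. The function name outline_from_notebook and the demo notebook are hypothetical.

```python
def outline_from_notebook(nb):
    """Return (level, title) pairs for Markdown headings in a notebook dict."""
    headings = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue  # '#' lines in code cells are comments, not headings
        source = cell.get("source", "")
        # A cell's source may be stored as one string or as a list of lines
        text = "".join(source) if isinstance(source, list) else source
        in_fence = False
        for line in text.splitlines():
            stripped = line.strip()
            if stripped.startswith("```"):
                in_fence = not in_fence  # skip fenced code inside Markdown
                continue
            if in_fence or not stripped.startswith("#"):
                continue
            level = len(stripped) - len(stripped.lstrip("#"))
            rest = stripped[level:]
            # A heading needs 1-6 '#' characters followed by a space and text
            if 1 <= level <= 6 and rest.startswith(" ") and rest.strip():
                headings.append((level, rest.strip()))
    return headings


# A tiny in-memory notebook, shaped like the JSON inside an .ipynb file
demo = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Load data\n", "Some notes"]},
        {"cell_type": "code", "source": ["# not a heading\n", "x = 1"]},
        {"cell_type": "markdown", "source": "## Clean columns"},
    ]
}

for level, title in outline_from_notebook(demo):
    print("  " * (level - 1) + title)
```

To try it on a real file, you could load the notebook with json.load and pass the resulting dictionary to the function. The add-on does far more than this (clickable links, numbering, a sidebar), but the underlying outline comes from the same kind of heading scan.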
Missing Variables – Reward if Found
The Variable Inspector add-on is a lifesaver when it comes to keeping track of variables. I’ve used it countless times when I have a lot of variables in one notebook. It’s especially helpful when you forget the value or type of a variable that you assigned much earlier in the notebook. Another use case is when you are experimenting and need to create a copy of the variable you’re using. After some hours of working on the code, you might forget what type of data it contains or, if you’ve recently started working through an existing notebook, you might forget which code cells you have already run. Stay sane by always having a reference to the name, type, size, and value of all the variables in a running notebook. The image below demonstrates how we can track all of those different details!
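If you are curious what such a listing involves, the sketch below collects the same four details (name, type, size, and value) from a dictionary of variables using only the standard library. It is an illustration under my own assumptions, not the add-on’s implementation; the function name inspect_variables and the demo data are hypothetical.

```python
import sys
import types


def inspect_variables(namespace, max_preview=40):
    """Return (name, type name, size in bytes, value preview) rows."""
    rows = []
    for name in sorted(namespace):
        value = namespace[name]
        if name.startswith("_"):
            continue  # skip private and interpreter bookkeeping names
        if callable(value) or isinstance(value, types.ModuleType):
            continue  # skip functions, classes, and imported modules
        preview = repr(value)
        if len(preview) > max_preview:
            preview = preview[: max_preview - 3] + "..."
        # sys.getsizeof is shallow: it does not count objects a container holds
        rows.append((name, type(value).__name__, sys.getsizeof(value), preview))
    return rows


# Hypothetical variables standing in for a notebook's state
demo_namespace = {
    "prices": [9.99, 14.5, 3.25],
    "label": "train",
    "n_rows": 3,
    "_hidden": 0,
}

for name, type_name, size, preview in inspect_variables(demo_namespace):
    print(f"{name:<8} {type_name:<6} {size:>5}  {preview}")
```

In a notebook cell you could call inspect_variables(globals()) instead of passing a demo dictionary, although the result may then include IPython’s own names such as In and Out. The add-on saves you from running anything by keeping a live table open next to your code.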
Need a Quick Test Area?
Tired of running test code in your main notebook file? Try the Scratchpad add-on, a completely separate little area for testing code on the go. It pops up a separate window that is connected to your existing notebook and lets you run code there. If you hate having a lot of code cells in your notebook, this lets you keep testing separate from your running notebook. You’ll have access to all the variables, functions, and anything else you’ve run from your main notebook while working in the scratchpad.
Snippets: Do You Re-use a Lot of Code?
Maybe you always import the same packages (I always need pandas, NumPy, and scikit-learn), and you hate having to dig through an old notebook to grab the import lines. Or maybe you would love to always have access to certain functions you’ve programmed in the past. The Snippets add-on lets you save and insert these segments of code. I’ve saved a snippet called example, which prints out the example snippet you see in the bottom code chunk.
Although the Jupyter Notebook environment is one of the most popular ways to learn and use Python for data science, a huge update to the system occurred in early 2018 with the release of JupyterLab. This release addressed a lot of the development-related critiques of Jupyter Notebook.
JupyterLab allows for multiple panels of information, each of which can be dragged and split based on user preference. Users can also open a whole host of different features, such as a terminal, a console, a file browser, and a file viewer, and they get easy access to extensions. These features make JupyterLab much more of a powerhouse functionality-wise and keep it relevant compared to competing Python integrated development environments (IDEs), the programs in which people write and edit code. Although the classic Jupyter Notebook system works well, it can’t compete with the additions that JupyterLab brought.
Now that we’ve explored Python, what about extensions and add-ons for R, specifically in RStudio?
RStudio is the most popular IDE for R programming. You might choose R over Python because R is more popular in academic and research settings, as well as with statisticians and scientists.
New and experienced users alike may not realize that RStudio supports extensions and add-ons. You can find them in the menu under Tools -> Addins -> Browse Addins. The add-ons for R are mostly documented and posted on the Comprehensive R Archive Network (CRAN), which means these augmentations are at least minimally reviewed and tested. Let’s take a look at some popular add-ins that will save you time and frustration when programming! You can install RStudio Addins, which allows you to browse and quickly download add-ins.
New to Making Visualizations in R?
Do you like using a visual reference when creating visualizations? Try ggThemeAssist, which opens a graphical user interface (GUI) for working on visualizations in R. While you learn the details and functions for producing visualizations, this makes it much easier to edit and finalize a visualization than manually typing out each edit and re-running the code to view it. I use this if I want to test and see the relationship between different variables without having to keep re-editing the code and running it for each new relationship I want to explore.
Don’t Want to Program?
Try the Radiant add-in, which allows you to use R for business analytics. Want to model and visualize data quickly? Radiant lets you upload data, view and visualize it, create quick pivot tables, combine datasets, run statistical tests such as goodness of fit, and do univariate and multivariate analysis, all in an easy GUI that opens in your browser. It’s extremely powerful and can also output a report as an R Markdown notebook file.
I hope that these add-ons to some of the most popular data science software can increase your productivity and reduce frustration as you master and work with Jupyter Notebooks, JupyterLab, and RStudio.
Ali Zaidi is a data scientist at MITRE. He specializes in machine learning and helped launch Generation AI. He has an MS in Data Science from the University of Virginia.
© 2020 The MITRE Corporation. All rights reserved. Approved for public release. Distribution unlimited. Case number 20-3226