Today we’re taking a tour into Data Science to learn a bit more about the field and tools used in this space!
The best way to hone your data analysis skills is consistent, deliberate practice.
One of the best places to acquire data for analysis is Kaggle, so practice your abilities with some Kaggle data sets.
Use the following data set: Video Game Sales - Sales data from more than 16,500 games
vg-stats. In the vg-stats notebook, answer the following questions and complete the following tasks. Note that the sales figures quoted are in millions and apply only to games with over 10,000 sales.
When you’re done answering all of the questions for each data set, clean up your notebooks leaving only cells that contain relevant data and calculations. Then restart and run your notebook so that the cell numbering is sequential from top to bottom.
Have fun with the data!! Play around a bit, and see if there’s anything else you can/want to do with the info available!
Video Game Stats. No tests are required, but all questions in Feature Tasks and Requirements must be clearly answered in your notebook.
Cycle Share Data set - Bicycle Trip Data from Seattle’s Cycle Share System
In the bike-stats notebook, answer the following questions/do the following tasks:
NOTE: There’s an issue with one of the CSV files. You will need to find a way to handle that error… Google it, and work around it!
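One possible workaround, as a minimal sketch: if the error turns out to be a parser failure caused by rows with too many fields (an assumption, since the note does not say what the issue is), `pd.read_csv` can be told to skip those rows. The CSV text and column names below are made up for illustration and are not taken from the Cycle Share files.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for the problematic file:
# the third data row has one field too many.
raw_csv = io.StringIO(
    "trip_id,duration\n"
    "1,300\n"
    "2,450\n"
    "3,500,unexpected\n"
    "4,120\n"
)

# on_bad_lines="skip" (available in pandas 1.3 and later) drops rows with
# too many fields instead of raising a ParserError.
trips = pd.read_csv(raw_csv, on_bad_lines="skip")
print(len(trips))  # number of rows that parsed cleanly
```

Skipping rows silently loses data, so it is worth counting how many rows were dropped before relying on the result.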
Implement a Queue using two Stacks.
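A minimal sketch of one common approach to this challenge, using plain Python lists as the two stacks. The class name `Queue` and the method names `enqueue` and `dequeue` are my own choices here, not something the assignment specifies.

```python
class Queue:
    """FIFO queue built from two LIFO stacks (plain Python lists)."""

    def __init__(self):
        self._inbox = []   # receives newly enqueued items
        self._outbox = []  # holds items in dequeue order

    def enqueue(self, value):
        self._inbox.append(value)

    def dequeue(self):
        # Refill the outbox only when it is empty. Popping everything from
        # the inbox reverses the order, so the oldest item ends up on top.
        if not self._outbox:
            while self._inbox:
                self._outbox.append(self._inbox.pop())
        if not self._outbox:
            raise IndexError("dequeue from empty queue")
        return self._outbox.pop()
```

Each item is moved between stacks at most once, so a sequence of enqueues and dequeues stays linear in the number of operations overall.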

Bookmark and Review
Statement on why this topic matters as it relates to what I’m studying in this module:
Pandas provides powerful data structures (DataFrame and Series) for efficient data manipulation and analysis.
Explain the purpose and basic functionality of the Pandas library. What are some common operations that can be performed on data using Pandas, and how do they contribute to data analysis and manipulation?
Pandas is a Python library designed for data manipulation and analysis. Its primary purpose is to provide data structures (Series and DataFrame) that make it easy to handle and analyze structured data. Pandas facilitates tasks such as data cleaning, exploration, and transformation. It excels in handling tabular and time-series data.
Common Operations:
import pandas as pd
# Loading data
data = pd.read_csv('data.csv')
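# Handling missing data (dropna returns a new DataFrame, so assign the result)
data = data.dropna()
# Selecting data (double brackets return a DataFrame rather than a Series)
data[['column_name']]
# Grouping and aggregating (numeric_only avoids errors on text columns)
data.groupby('category').mean(numeric_only=True)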
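To try these operations without a `data.csv` on disk, here is a small self-contained sketch. The column names and values are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the contents of data.csv
data = pd.DataFrame({
    "category": ["a", "a", "b", "b"],
    "value": [1.0, None, 3.0, 5.0],
})

# dropna returns a new DataFrame without the row that has a missing value
cleaned = data.dropna()

# Double brackets select a list of columns and keep a DataFrame
values = cleaned[["value"]]

# Group by category and average the numeric column
means = cleaned.groupby("category")["value"].mean()
print(means)
```

Selecting the `value` column before calling `mean()` keeps the aggregation limited to numeric data.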
What are the primary data structures in Pandas, and how do they differ in terms of use cases?
Series: A one-dimensional labeled array capable of holding any data type. It is similar to a column in a spreadsheet or a single-column table in a database.
# Creating a Series
series_data = pd.Series([1, 2, 3, 4], name='column_name')
DataFrame: A two-dimensional labeled data structure with columns that can be of different types. It is akin to a spreadsheet or SQL table.
# Creating a DataFrame
df = pd.DataFrame({'column1': [1, 2, 3], 'column2': ['a', 'b', 'c']})
Differences in Use Cases:
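One way to see the difference in practice, as a rough sketch: pulling a single column out of a DataFrame gives a Series, while the DataFrame itself holds several columns that can each have their own type. The variable names `single` and `table` are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({"column1": [1, 2, 3], "column2": ["a", "b", "c"]})

single = df["column1"]    # one column label -> Series (one-dimensional)
table = df[["column1"]]   # list of column labels -> DataFrame (two-dimensional)

print(type(single).__name__, single.shape)
print(type(table).__name__, table.shape)

# Each DataFrame column carries its own dtype
print(df.dtypes)
```

A Series fits a single variable or one column of values, and a DataFrame fits a whole table where rows share an index across several columns.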
Describe the process of loading a dataset into a Pandas DataFrame. What are some common file formats that can be used, and which Pandas functions are utilized to read these formats?
Process:
# Reading data from CSV
data_csv = pd.read_csv('data.csv')
# Reading data from Excel
data_excel = pd.read_excel('data.xlsx')
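# Reading data from SQL (connection is an existing database connection;
# my_table is a placeholder name, since "table" is a reserved SQL keyword)
data_sql = pd.read_sql('SELECT * FROM my_table', connection)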
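The calls above need files and a database that may not exist yet, so here is a self-contained sketch of the same idea. It assumes an in-memory CSV string and an in-memory SQLite database; the `scores` table and its contents are made up for illustration.

```python
import io
import sqlite3

import pandas as pd

# CSV: read_csv accepts a file path or any file-like object
csv_text = "name,score\nada,90\ngrace,95\n"
data_csv = pd.read_csv(io.StringIO(csv_text))

# SQL: build a throwaway in-memory SQLite database to query
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
connection.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("ada", 90), ("grace", 95)],
)
connection.commit()

data_sql = pd.read_sql("SELECT * FROM scores", connection)
connection.close()

print(data_csv)
print(data_sql)
```

Both calls return a DataFrame, so the rest of the analysis does not need to care where the data came from.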
Common File Formats:
For this journal entry, there is no specific prompt other than to start off with “Today I learned…” or “Recently I learned…” and go from there.
All Podcasts from codenewbie.org
Coding Bootcamps and Coding Journeys - Mark Thompson
In this podcast, Mark touched upon coding projects, career guidance, and the future landscape of coding education. He underscored the significance of crafting distinctive and impactful coding projects for portfolios and discouraged generic projects, urging a focus on completeness and tackling real problems with innovative solutions. Mark recommended proactively managing one’s career, nurturing relationships with managers, and documenting accomplishments for career progression. He stressed the importance of setting one’s pace and avoiding comparisons with others when embarking on a coding journey.
“You can go at your own pace and don’t compare yourself to other people. I spent so much time looking at the people who were just better than me. Like objectively, there are people who are better than you at code. And I would look at those people and feel like, ‘Ah, if they’re in the job market or if they’re going after this, there’s no way I can make it.’ Bump that. Don’t spend any time thinking like that because like I had mentioned before, that person can only hold one job and there are thousands of jobs. So don’t worry about it. You be your best. Go at the pace that you need to go at to be successful.”
What Your Bootcamp Isn’t Teaching You - Caitlyn Greffly
Caitlyn reflected on the emotional aspects of coding, debunking the stereotype of tech as purely logical and emotionless. She shared her discovery of the creative side of coding, especially in UX/UI design, along with the joy of creating aesthetically pleasing interfaces and the sense of accomplishment in completing projects, which is exactly where I’m at right now.
One thing she said bootcamps aren’t teaching is how to handle a company’s existing code. The interview concluded with insights into dealing with legacy codebases, emphasizing the need for patience and for adapting to a slower-paced, collaborative work environment.
She discussed her approach to starting a new job in tech. She prefers to get a broad overview initially, understanding the structure of repositories and team focus, and then to choose a small, simple task to dive into the code and gain practical experience, using it as a guide to learn about the company’s processes and standards. Caitlyn emphasized the importance of exploring different areas within the first few months to understand how everything fits together. She also gave advice on dealing with tech jargon: jot down unfamiliar terms, ask questions, and use online resources. She recommended a targeted job hunt, especially seeking out companies that hire or have hired bootcampers in the past.
What are Some Common Mistakes People Make When Learning to Code - Ceora Ford
Ceora emphasized the importance of overcoming common mistakes, such as jumping between languages, lack of consistency, and not building or finishing projects. She recommends project-based learning and staying motivated through habits and incentives.
She said, “Even if you are a super senior super experienced developer, you’re always going to run into problems. You’re always going to run into bugs and those bugs can make you feel like you don’t know what you’re doing, which contributes to the whole imposter syndrome thing. ‘I’m afraid to code in front of this person, because what if I really don’t know what I’m doing and they’re going to think I faked my way to this position or whatever?’ And that’s been something that I have literally let take over more times than I would like to admit. And it’s something I’ve been actively working on now for the past couple of months. For instance, I was like deathly afraid of coding live for a really long time, which is something you need to do for interviews.” This is me word for word. I really was having imposter syndrome this entire time. I also have social anxiety, so I’m insanely anxious to code in front of people.