Data Analysis Using Python - University of Pennsylvania
- a course by the University of Pennsylvania
My diary of the study journey:
- Instructor: Brandon Krakowsky
- Coursera Course - Data Analysis Using Python
- Part of Introduction to Programming with Python and Java Specialization
Data Analysis Using Python
- Part 1 of the "Introduction to Programming with Python and Java" Specialization
- Download and install Anaconda (Python 3 + Jupyter Notebook all at once)
- Coding demo 1: Analyzing the 500 Greatest Albums of All Time
- Coding demo 2:
- Lambda:
- e.g.: Github Module 1 -
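As a reminder of what a lambda looks like, here is a minimal sketch. The album titles and years are hypothetical sample data made up for illustration, not taken from the course notebook.

```python
# A lambda is an anonymous function limited to a single expression.
square = lambda x: x * x

# Hypothetical sample data: (album title, release year) pairs.
albums = [("Album C", 1975), ("Album A", 1967), ("Album B", 1991)]

# A lambda is handy as a short sort key: order the albums by year.
by_year = sorted(albums, key=lambda album: album[1])

print(square(4))
print(by_year[0])
```

For anything longer than one expression, a regular def function is usually easier to read.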
- Quiz 1:
- Quiz 2:
- Run Module 1 - Lab 1 - Work with UFO sightings data (hint: it is better to create your own GitHub account and "fork" the whole project into your own account space to run it).
- (*Step) Clean Data & Deal With Missing Data
- df[-1:]['name'] is the same as that at line #47 above.
- ----- ----- -----
- Computations - sum() and mean(): Hyperlink
- Other Methods - Video Hyperlink
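The cleaning and computation steps above could be sketched roughly as follows. The DataFrame and its "name" and "score" columns are hypothetical sample data, not the lab's dataset, and filling with the mean is only one possible way to deal with missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are made up for illustration.
df = pd.DataFrame({
    "name": ["Ann", "Bob", None, "Dee"],
    "score": [90.0, np.nan, 70.0, 80.0],
})

# Count the missing values per column, then drop rows that have no name.
missing_per_column = df.isnull().sum()
cleaned = df.dropna(subset=["name"])

# Fill the remaining missing scores with the column mean (NaN is skipped).
cleaned = cleaned.fillna({"score": cleaned["score"].mean()})

# df[-1:] slices the last row; ['name'] then selects that row's name column.
last_name = cleaned[-1:]["name"]

# Computations on a single column.
total = cleaned["score"].sum()
average = cleaned["score"].mean()

print(missing_per_column)
print(last_name)
print(total, average)
```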
- There are four options, and three of them can run without errors:
- .....
- Module 3: aggregate functions
- The pivot table above uses state and city as its index. The values to aggregate (total) come from the column "review_count", as shown above, and the aggregate function is specified as well (e.g. np.sum, where np is NumPy).
- To segment the results, use the "columns" parameter:
- To apply different aggregate functions to different values, pass a dict object as the aggfunc argument: below, "review_count" is the key and the NumPy sum function is applied to the column "review_count".
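The three pivot table variations described above might look like the sketch below. Only "state", "city" and "review_count" come from the notes; the "is_open" and "stars" columns and all the values are assumptions made up for illustration. The aggregate functions are passed by name ("sum", "mean") here; the course notes pass np.sum, which should give the same totals.

```python
import pandas as pd

# Hypothetical sample data; "is_open" and "stars" are assumed columns.
df = pd.DataFrame({
    "state": ["NV", "NV", "NV", "AZ", "AZ"],
    "city": ["Las Vegas", "Las Vegas", "Henderson", "Phoenix", "Phoenix"],
    "is_open": [1, 0, 1, 1, 1],
    "review_count": [10, 5, 7, 3, 4],
    "stars": [4.0, 3.0, 5.0, 2.0, 4.0],
})

# 1. Index on state and city, and total the review counts per group.
totals = pd.pivot_table(
    df, index=["state", "city"], values="review_count", aggfunc="sum"
)

# 2. Segment the same totals with the "columns" parameter.
by_open = pd.pivot_table(
    df, index=["state", "city"], columns="is_open",
    values="review_count", aggfunc="sum",
)

# 3. Pass a dict to aggfunc to use a different function per value column.
mixed = pd.pivot_table(
    df, index=["state", "city"], values=["review_count", "stars"],
    aggfunc={"review_count": "sum", "stars": "mean"},
)

print(totals)
print(by_open)
print(mixed)
```

Groups that have no rows for a given column value (for example a city with no closed businesses) show up as NaN in the segmented table.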
- Quiz 5: - Summarizing Data
- It splits the data into different groups by 'genre'. Here's why: df.groupby(['genre']) groups the DataFrame by the values in the 'genre' column. It creates separate groups for each unique genre value (e.g., Action, Comedy, Drama, etc.). This allows you to perform aggregate operations on each group separately (like .mean(), .sum(), .count(), etc.)
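To make the grouping concrete, here is a small sketch. The "genre" column comes from the quiz; the "rating" column and the values are hypothetical sample data.

```python
import pandas as pd

# Hypothetical sample data; "rating" is an assumed column.
df = pd.DataFrame({
    "genre": ["Action", "Comedy", "Action", "Drama", "Comedy"],
    "rating": [4.0, 3.0, 5.0, 4.5, 2.0],
})

# Split the rows into one group per unique genre value.
grouped = df.groupby(["genre"])

# Aggregate operations then run on each group separately.
mean_by_genre = grouped["rating"].mean()
count_by_genre = grouped["rating"].count()

print(grouped.ngroups)
print(mean_by_genre)
print(count_by_genre)
```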
- Jupyter Notebook magic functions: the magic function %pylab inline loads the pylab library and lets our visualizations show up inside a notebook.
- Module 3: Lab - Summarize Movie & Ratings Data (Jupyter Notebook); Instructions
- Module 3: Histogram Demo
- My Jupyter Notebook file on GitHub (please fork it to your own GitHub account to run)
- Module 3: Quiz 6 - Visualizing Data
- Q1
- Github: Data Analysis Python - Penn
- Nice free tutorials on: W3schools
- Run VS Code in Github.
----- ----- ----- ----- -----
REPL: Using help( ) function / Help Learning
- To confirm that sorted() is built-in, check its availability in an interactive Python session:
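One way to run this check is sketched below, using the standard builtins module. It is written as a script here, but the same lines can be typed one by one in the REPL.

```python
import builtins

# sorted lives in the builtins module, so it can be called without an import.
is_builtin = hasattr(builtins, "sorted") and sorted is builtins.sorted

print(is_builtin)
print(type(sorted))
print(sorted([3, 1, 2]))
```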
Questions to Grok:
- How to show all the built-in functions and objects?
- How to get the documentation for sorted(), similar to help(sorted)?
- e.g. help('module')?
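A possible answer to these questions, sketched with the standard library only. The variable names are made up for illustration. Note that it is help('modules'), with an "s", that lists the installed modules.

```python
import builtins
import pydoc

# 1. List every public built-in name (functions, exceptions, constants).
builtin_names = [name for name in dir(builtins) if not name.startswith("_")]

# Keep only the callables (functions and classes).
builtin_callables = [n for n in builtin_names if callable(getattr(builtins, n))]

# 2. Get the documentation for sorted() as a string instead of a pager.
doc_text = sorted.__doc__
rendered = pydoc.render_doc(sorted, renderer=pydoc.plaintext)

print(len(builtin_names))
print(doc_text)

# 3. In an interactive session, help() also accepts strings:
#    help("sorted")    shows the same documentation as help(sorted)
#    help("modules")   lists the installed modules (this can be slow)
#    help("keywords")  lists the Python keywords
```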