Processing data with pandas

Attention

Finnish university students are encouraged to use the CSC Noppe platform.

Others can follow the lesson and fill in their student notebooks using Binder. Binder can be launched using the interactive code tools (rocket icon) on the lesson page linked below.

In this section we continue with basic data manipulation and analysis methods, such as calculations and selections, using pandas.
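
As a rough illustration of what calculations and selections look like in practice, the sketch below builds a tiny DataFrame by hand, adds a column computed from two others, and then selects the rows that match a condition. The values and the DIFF column name are made up for this example; only the TEMP1 and TEMP2 column names follow the lesson data described below.

```python
import pandas as pd

# Hypothetical values for illustration only (not real observations)
data = pd.DataFrame(
    {
        "TEMP1": [15.0, 17.5, 20.0, 18.5],
        "TEMP2": [18.0, 21.0, 26.5, 19.0],
    }
)

# Calculation: difference between the two readings (DIFF is an assumed name)
data["DIFF"] = data["TEMP2"] - data["TEMP1"]

# Selection: keep only the rows where the second reading is greater than 20
warm = data.loc[data["TEMP2"] > 20]

print(data)
print(warm)
```

The calculation works on whole columns at once, and the selection uses a Boolean condition inside `.loc` rather than a loop over rows.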

Lesson materials#

Materials for this lesson are from Chapter 3 of the forthcoming textbook Introduction to Python for Geographic Data Analysis. In this lesson we cover the materials from Chapter 3.2 (Common tabular operations in pandas). A brief description of the data used in this lesson can also be found below.

Input data: Weather statistics#

Our input data is a text file containing weather observations from Kumpula, Helsinki, Finland, retrieved from the Finnish Meteorological Institute (FMI):

  • File name: kumpula-summer-2024.txt (have a look at the file before reading it in using pandas!)

  • The file is available on the Binder and CSC Noppe platforms in the L5/data directory.

  • The data file contains daily temperatures observed twice per day (TEMP1: 10 AM, TEMP2: 4 PM), as well as minimum and maximum temperatures, from summer 2024 (1.6.2024 - 31.8.2024), recorded at the Kumpula weather observation station in Helsinki.

  • There are 92 rows of data in this sample data set.

  • The data has been derived from a data file of daily temperature measurements downloaded from the FMI data service. The structure of the data in the file has been modified for use in this lesson.
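
Reading a file like this into pandas might look roughly like the sketch below. The layout of the real file is not described here, so the sketch reads a small inline sample instead. The comma delimiter, the DATE, MIN, and MAX column names, the date format, and all values are assumptions for illustration; only TEMP1 and TEMP2 come from the description above. Have a look at the actual file before adapting it.

```python
import io

import pandas as pd

# Made-up sample standing in for the real file; its layout is an assumption
sample = io.StringIO(
    "DATE,TEMP1,TEMP2,MIN,MAX\n"
    "20240601,18.0,21.5,12.0,23.0\n"
    "20240602,17.0,20.0,11.5,21.0\n"
    "20240603,19.5,23.0,13.0,24.5\n"
)

# With the real data, pass the path to kumpula-summer-2024.txt instead of
# `sample`, and set `sep` to whatever delimiter the file actually uses
data = pd.read_csv(sample, sep=",")

# Quick checks after reading: first rows, row count, and column data types
print(data.head())
print(len(data))  # the full lesson data set should give 92 here
print(data.dtypes)
```

Checking the row count and data types right after reading is a cheap way to confirm that the delimiter and header were interpreted as intended.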