Excel and Data Analysiswiclogo

WICshop handout March 2009
Presenter: Anu Vedantham

Description: We will show you how to use Excel to extract meaning from data. We will bring in data from webpages, convert and format data in automated ways, look at ways to make data cleaner and more reliable, graph trends and find meaning in masses of numbers.

Available online at http://wic.library.upenn.edu/multimedia/tutorials/exceldata.html

Questions? Email wic1@pobox.upenn.edu or stop by the Weigle Information Commons. The Library ITA can help with Excel questions.

1. Introductions

1.1. Index card – write down one thing you wish you knew how to do in Excel
1.2. We will use the Office 2007 on the PCs and Office 2006 on the Macs. We will explain how to work across versions and platforms where possible.
1.3. Learning Outcomes:

• Moving around data sets, sorting, and naming ranges
• Data correction and filter use
• Data analysis and pivot tables
• Conditional formatting

2. Basics of moving around the data – Open sampletofix.xls (modified from a sample at contextures.com)

2.1. Mac Keyboard shortcuts: http://tags.library.upenn.edu/makerecord/url/39747
2.2. PC Keyboard shortcuts: http://tags.library.upenn.edu/makerecord/url/39748
2.3. Ctrl and arrow keys: find edges of ranges
2.4. Ctrl and End key – last filled cell on the sheet
2.5. Select a row or column – click on its name (A, 2, etc.)
2.6. Select a sheet – click on the gray cell at top (has a diamond on Mac)
2.7. Shift-click selects adjacent cells
2.8. Ctrl-click selects random cells - can format but cannot copy these
2.9. Copy and paste works with most types of selections
2.10. Paste Special options - helpful for data from web, other programs

3. Sorting

3.1. Sorting looks for continuous ranges - no empty rows or columns. Explore this by adding an empty row and sort. Notice how the rest of your data is ignored. Now fill in one cell on the empty row and sort. Notice how all your data is sorted again.
3.2. Data format - is it a number or does it just look like one? There is one example under unit cost - a green triangle and warning symbol flag this.
3.3. New feature in Excel 2007/2008 - Sorting by format or color - easy way to flag bad data

4. Ranges

4.1. Name a range:

• Excel 2007 PC: Select the data, then click on the Name Box (upper left of screen) and create a name.
• Excel 2006 PC/Mac: Select data, click Insert / Name / Define
• Select the Units column of data and name it “units”.
4.2. Move to a range:
• Excel 2007 PC: Click on the Name Box and pick out its name
• Excel 2006 PC/Mac: Click Edit / Go To and pick out its name
4.3. Use named range in any function easily: Find a blank cell and type in: =average(units)
4.4. How could you use named ranges?

5. Functions

5.1. Quick overview
5.2. Adding in Data Analysis Toolkit provides functions for stats (regression, t-test, etc.)

6. Generating Data automatically - ID numbers

6.1. Add an ID column to the left of the data. (Click in A1, right-click and choose Insert / Entire Column)
6.2. Put the label “ID” in A1. Put the starting ID number 1 in the new A2.
6.3. AutoFill will replicate a pattern if you give it two starting values. So if A2 and A3 contain sequential numbers and you select them and drag down (Click and drag on black cross at bottom right of cell), the pattern will be replicated. Try 1 and 2, and then try 1 and 3. Now try January and February. Now try 2009 and 2010.
6.4. AutoFill can also follow instructions. Put a 1 in A2 and clear the rest of the column. Now try the Fill / Series command.
o Excel 2007 PC: Look for the fill icon on right panel of Home tab on ribbon
o Excel 2006 PC/Mac: Edit / Fill / Series…
6.5. Fill / Series can produce weekdays (for scheduling calendars), growth trends and more.
6.6. When bringing in data from outside sources, it is best to start with a adding a sequential ID column at the left. This protects you in case you damage data during cleanup.

7. Filters

• Excel 2007 PC: Under Data Tab, look for Filter
• Excel 2006 PC/Mac: Data / Filter / Auto-Filter
7.1. Use the filters to find and fix data errors. Each column in the spreadsheet has one error for you to find. Share with your neighbors. Look for crazy values, misspellings, missing data
7.2. Filtering by multiple fields - basic queries
7.3. Number filters - Top 10… options to look at high and low values, percents

8. Pivot Tables

8.1. Close this spreadsheet and open pivot table sample spreadsheet from Microsoft - a much bigger dataset
• Excel 2007 PC: Insert / Pivot Table
• Excel 2006 PC/Mac: Data / Pivot Table Report
8.2. Follow the wizard and choose a new sheet for the pivot table.
8.3. Pivot Tables look very different across versions! The general idea is to choose row variables, and if crosstab is needed, add column variables.
8.4. First, make a pivot table to look at total sales per region. Drag region to the column area and revenue to the data area.
8.5. Next, add product as a column variable to get a crosstab.
8.6. Next add customer as a second row variable. Notice how rearranging the order of row variables affects analysis.
8.7. Pivot Tables are not flat data! Use Paste Special to get result values only
8.8. Taking snapshots of the data – Filtering the pivot table. You can mix and match filters with pivot tables to do powerful analyses on the fly.

9. Conditional Formatting

9.1. Close active spreadsheet and open third sample spreadsheet - adapted from an Amherst College tutorial.
9.2. Select all the data and name it: Hold CTRL and SHIFT, then Right arrow then down arrow. Define the range with name “data”
• Excel 2007 PC: Home / Conditional Formatting
• Excel 2006 PC/Mac: Format / Conditional Formatting
9.3. Simple - try setting positive numbers to green
9.4. New features in Office 2007 PC - try the automated color scale.
9.5. Now try a double format – red for more than 20 degrees and blue for less than -20 degrees. Browse the sheet now and the data is easy to understand
9.6. You can get sophisticated with formulas as well.

10. Reference resources

11. Q&A from index cards