[Tutorial] Ten concepts that every Calc user should know

Forum rules
No question in this section please
For any question related to a topic, create a new thread in the relevant section.
Locked
User avatar
MrProgrammer
Moderator
Posts: 4883
Joined: Fri Jun 04, 2010 7:57 pm
Location: Wisconsin, USA

[Tutorial] Ten concepts that every Calc user should know

Post by MrProgrammer »

If some of the tutorial is unclear, just keep going. It may make more sense on the second or third reading.

0. Principal Components
A spreadsheet is a collection of sheets. Each sheet is a rectangular grid of cells. Current versions of Calc provide large numbers of rows and columns, far more than most people will ever need. Unused rows/columns are not stored when you save the spreadsheet, so there no need to "delete" them to conserve space. Rows are 1, 2, 3, … and columns are A, B, C, …, Z, AA, AB, AC, …, AZ, BA, BB, …, ZZ, AAA, AAB, …. It is not possible to begin numbering the rows at zero. All of the cells in a column must be the same width; all of the cells in a row must be the same height. For several features of Calc to work well, you should use the first row of your sheet to hold column labels. That is, if a column will hold numbers representing expenses, put the word "Expense" in row 1 and begin entering your data in rows 2, 3, and so on. Select cell A2 and use Window → Freeze so that your labels remain visible as you scroll down in the spreadsheet.

1. Types of data
Each non-empty cell can contain a number or a text string. If you plan to do numeric calculations with your cells, you should ensure that their values are entered as numbers and not as text because Calc's ability to perform arithmetic on text is somewhat limited. The number 25 is different than the 2-character text string 25; entry of numbers or text is described in 5. Understanding data entry below. You can check whether a cell contains a number or a text string using View → Value Highlighting. Numbers will be blue. Turn off the Value Highlighting feature if you aren't using it because it can override your formatting settings. The special values TRUE and FALSE are really numbers. TRUE has the value 1 and FALSE has the value 0. Conversion of cell values from text to numbers, or vice versa, is explained in section 7. Analyzing and modifying your data below.

2. Controlling how data is displayed
You can choose how the numbers in your cells are displayed using the Format → Cells → Numbers dialog or with styles. These dialogs group the formatting choices into broad Categories (Number, Percent, Currency, …) and specific Formats. For example, you can format numbers with more decimal places, with digit grouping characters (either 1,234,567 or 1.234.567 depending on your locale), or with a currency symbol (like $ or €). Very large and very small numbers can be displayed in Scientific notation as d…Ennn (this represents d… times 10ⁿⁿⁿ) or d…E-nnn (this represents d… divided by 10ⁿⁿⁿ). No matter which Category/Format you choose, the cell's value contains the pure number, not the grouping characters or the currency symbol. The dialog only changes how values are displayed on the screen, not how they are stored in the cell. Changing the number format does not affect the cell's value. Therefore you cannot convert a text string into a number, or vice versa, by changing the cell's number format. Numeric calculations are done with the cell's value, not with the number's display format, unless you enable option OpenOffice.org Calc → Calculate → Precision As Shown. Access the options dialog using OpenOffice.org → Preferences on a Mac or Tools → Options on other platforms. If the cell's width is too small to hold a number's formatted value, Calc will display ###. You fix this by making the cell width wider; double click the separator bar on the right side of the column name at the top of the sheet. The Edit → Find & Replace dialog looks at the pure values, not the display format, so searches for the grouping characters or the currency symbol will fail. You can do arithmetic with numbers that show a currency symbol because the symbol is not part of the value. Your default locale (language and country) is established when you install Calc and the other programs in the suite. It controls many of the displays in Calc. You can change your locale using option Language Settings → Languages → Locale setting.

3. Dates in cells
A cell can contain a date but the date is really stored as an integer. When you type a date in a cell, Calc will determine the correct integer automatically, but will display it as a date. You cannot change the default date format for a locale, but you can change your locale to one where local customs match your preferences. For example if you pick locale English (USA) dates are MM/DD/YY and if you pick English (UK) dates are DD/MM/YY. Or, if you prefer a date format that is not the default for your locale, the integer can be displayed in many different date formats using the Format → Cells → Numbers → Category → Date dialog. You can format empty cells, or an entire column, to your preferred date format with the dialog or with styles so that dates will be displayed as desired when they are entered. Each date's integer is one larger than the integer for the previous day, across weeks, months, years, and leap years, so you can obtain the number of days between two dates by subtracting them and displaying this difference as a number. To enter the current date in a cell, type the day number followed by your locale's date separator. For example enter 31/ on Halloween. Spreadsheets do not provide a simple way to designate a cell as a timestamp, that is, a cell which will contain the date that another cell in the row is modified.

4. Times in cells
A cell can contain a clock time, but the time is really stored as the corresponding fraction of a day. For example 6 AM is stored as 0.25 because it is ¼ of the way from midnight to midnight, but it is displayed as 6:00:00 AM. You can use Format → Cells → Numbers → Category → Time to display clock times in many formats, for example H:MM or HH:MM or HH:MM:SS, and any of these can have an optional AM/PM if you don't want to use 24-hour (military) time. A cell can contain both a date and a time. The value in the cell is the sum of the integer for the date and the fraction for the time. Format date/time cells by combining a date format and a clock time format. For cells which represent durations (difference of two clock times), the value could be more than one day, that is larger than 1, or even negative. The previously mentioned clock time formats display only the positive fraction; Calc adds or subtracts whole days until the time is between 00:00:00 and 23:59:59. To display a duration which exceeds 24 hours or is negative, you must enter the first part of the time format in brackets in the Format code field: [HH]:MM:SS or [HH]:MM or [HH] or [MM]:SS or [MM] or [SS]. There is no cell format which displays durations as days with hours/minutes/seconds. Be sure to remove the AM/PM format code from duration cells since they don't represent clock times, and always use one of the bracketed formats.

5. Understanding data entry
While you can control the output format for a cell using Format → Cells → Numbers → Category, the interpretation of data typed into a cell is always based on the customs for your locale. The cell's output format is ignored as you enter data, unless the Category is Text. Anything typed into a Text cell is entered exactly as typed and it will be stored as a string, never as a number, and never as a formula. Postal codes (ZIP codes) are not numbers so set the Category to Text before entering them. For a Category other than Text, Calc uses your locale to interpret what you entered. A cell's default is Category → Number with Format → General. If the cell is set to this default, when you enter data Calc will set the Category to match what you typed. The following examples use the English (USA) locale.
  • If you enter +00012 Calc will decide that the cell's value is 12 and provide the default numeric formatting.
  • If you enter 125% Calc will decide that the value is 1.25 but you want it displayed as a percentage (Category → Percent).
  • If you enter $5 Calc will decide that this is the value 5 and display it with a currency symbol on the left and two decimal places (that is, $5.00).
  • If you enter .7E4 Calc will decide that this is scientific notation for the value 7000 and display it as 7.00E+003.
  • If you enter 0 1/2 Calc will decide that this is the value 0.5 and will display it as the fraction 1/2.
  • If you enter 1/2 Calc will decide that this is January 2nd of the current year and display it as 01/02/11. You can enter a date with an alphabetic month name or abbreviation and a 4-digit year, like 7-Dec-1941, so Calc will be able to determine the correct date even if the date separator and year-month-day order do not match the locale's convention. You can enter dates in YYYY-MM-DD format in any locale. Note that Calc will use this locale's date convention of month before day, so that 1/2 is January 2nd even if the cell was previously formatted as DD/MM/YY; but the format is effective for the output display and the cell will show 02/01/11.
  • If you enter 24/ Calc will decide that this is the 24th of the current month and will display it as 03/24/11. This is an easy way to enter the current date if you know the day of the month.
  • If you enter 1:23 Calc will decide that this is a time in hours and minutes. To enter 1 minute and 23 seconds, type 1:23.0 or 0:1:23.
  • If you enter 1/2 3:45 Calc will decide that this is a date/time value.
  • If you enter data in a format that Calc cannot recognize, Calc will store it as a text string. For example €5 is stored as a string because the Euro symbol isn't recognized as currency in the USA locale. You can always force Calc to store numbers as text by prefixing them with a single quote (apostrophe); the prefix is not part of the cell's value so '1/2 is stored as a three-digit string and will not be converted to a date.
6. Saving your data and making backups
After you build your spreadsheet you will want to save it to a file. Calc gives you a choice of many file formats. The best format is the ODF Spreadsheet which will create {YourFile}.ods (Open Document Spreadsheet). Saving in other formats risks loss of features or formatting. If you want to send this spreadsheet to someone who uses Excel, save first in ODS format, then use File → Save As and select File type Excel 97/2000/XP to create {YourFile}.xls for your colleague. If you receive a spreadsheet created by Excel keep in mind that while most Excel features have an equivalent in Calc, complex sheets cannot always be translated perfectly. If you make changes to an Excel spreadsheet, save to ODS format for yourself, not directly back to XLS format. To save your spreadsheet to a an external device (like a USB key), it is safer to save to your computer's disk drive first, then use your operating system tools to copy the file to the external device. When working with a valuable spreadsheet it is imperative to make periodic backup copies in case disaster strikes. Using File → Save every five minutes is not making backup copies because only one copy exists and if your laptop battery goes dead you may find, as others have, that your only file was destroyed when power was lost. Saving copies to external devices or sending files as attachments to a web-based email system protects you in case of disk drive failures. Enabling option Load/Save → General → Always create backup copy is highly recommended but this should not be regarded as a substitute for making periodic copies yourself, because these automatic backups are still on your disk drive. You can put a password on your file using the File → Save As dialog, but if you forget the password you used, your data is gone forever.

7. Analyzing and modifying your data
The Data menu will help you to analyze and modify your spreadsheet. You can use the Data → Sort dialog to rearrange rows in a desired order. Calc assumes that you want to keep all the items on the row together during sorting, though you can sort an individual column independently by making the proper selections in the sort dialog boxes. If your data has labels in the first row, Calc understands that you probably do not want that one sorted. Sorting rows is more common, but it is possible to request a left-to-right sort of columns. For numeric data, you can summarize cells by groups to get a sum, count, average, etc. using the Data → PivotTable dialog (formerly called DataPilot) where you indicate which columns contain the group names and which contain the data to summarize. Multiple levels of groups are supported and groups can be displayed horizontally, vertically, or both. You can use the Data → Filter dialog to display subsets of your data. Putting all your data in one sheet and filtering can be a good alternative to splitting it into separate sheets. In general, it is easier to analyze data when it's all together. For example, it will be difficult to use a PivotTable if the data is stored on multiple sheets. When data is filtered, the non-selected rows are still there but are hidden. Hidden rows do not participate in operations like cell format changes, cell content erasure, or deletion of rows, thus it can be useful to hide the things you want untouched, make changes, then remove the filter. Data → Text to Columns will help you separate text at delimiters (or by position), so you can convert the three rows "Apples:50" "Bananas:2" "Cherries:20" into two columns so the numbers can be summed. You can also use this feature to convert strings to numbers, dates, or times and vice versa. See any of these topics in Help → Index for details. Anytime you change cell contents or formatting, whether by typing or by using menus, you can reverse the change using Edit → Undo. So whenever something unexpected happens, immediately use Undo. Multiple Undo's are supported. If you don't like the result of Undo, you can undo the Undo with Edit → Redo.

8. Using formulas and cell references
You can use formulas to perform calculations on cells. When you select a cell you can enter the formula in the "formula bar" at the top of the screen. The formula specifies the value to be displayed in the cell where it is entered. A cell's formula can never alter the content in a different cell. After entering your formula you will see the result in the cell and the formula in the formula bar. Formulas begin with an equal sign, and for numeric values you can add, subtract, multiply, or divide using the operators +, -, *, and /. Multiplication and division are performed before addition and subtraction unless you use parentheses, as in =(cell+cell)*cell. You can join strings (constants or cell constants) with the & operator, as in ="Important: "&cell. If you join numbers with &, Calc assumes that you want to join the strings representing those numbers. You reference a specific cell by using the column letter and row number each preceded with a $, as in $P$5. But in most cases we really want to reference some nearby cells. For example we may have units sold in column A and prices in column B. We can to calculate total cost in cell C2 using the formula =A2*B2. By omitting the $ we are making relative (not specific) references to those cells, that is, we are referencing the two cells to the left of C2. How is that different than =$A$2*$B$2? Consider what happens if we copy and paste the formula =$A$2*$B$2 from C2 to C3. It references the specific cells A2 and B2 so the formula in C3 will also be =$A$2*$B$2. But if we copy/paste =A2*B2 from C2 to C3 it means "the product of the two cells to the left" and the formula in C3 becomes =A3*B3. In C2's formula, A2 means "two cells to the left", B2 means "one cell to the left", F2 would mean "three cells right", C3 would mean "one cell down", D3 would mean "one cell down and one cell to the right", and so on. In C2's formula, $A2 would mean "this row column A", C$5 would mean "this column row 5", and B$5 would mean "row 5 one column to the left". The simplest way to copy formulas to nearby cells is using the fill handle. As an example, put 0 through 9 in cells B1:K1 and also in A2:A11. Type =$A2*B$1 in B2 (using $ twice as shown) and press Enter. You'll see a small black square (the "fill handle") in B2's lower right corner. Drag the handle to the right, stopping in K2 and release the mouse. Leaving the ten cells highlighted, double click the fill handle to fill the formulas down to match column A. You've built a multiplication table. Inspect these cells and note how the formulas changed when they were copied. You will understand this example much better if you actually perform the steps instead of just reading about it.

9. Using functions and cell ranges
Calc provides many functions that can be used in formulas. See Help → Contents → Spreadsheets → Function types and Operators → Functions by Category for details. Common ones are SUM, COUNT, SUMIF, COUNTIF, COUNTA, IF, AND, OR, VALUE, TEXT, LEFT, RIGHT, MID, VLOOKUP, and MATCH. Some functions, like SUM or COUNT will accept a range of cells. Ranges are indicated by a colon so that =SUM(A2:A5) gives the sum of the four cells A2 through A5. It is equivalent to typing =SUM(A2;A3;A4;A5), but as the range gets larger it becomes impractical to enter many individual cell names. SUMIF and COUNTIF allow you to sum or count cells that match specified values in other cells. VLOOKUP allow you to locate a value in a table and return matching results. Calc's functions and formulas operate on cell values, so do not expect to perform calculations based on cell formats, say the sum of all of the cells you've formatted as red. You can nest functions, for example =IF(AND(A2>=30;A2<40);A2+B2;50). Note that Calc uses a semicolon, not a comma, to separate operands for a function. For complex tasks it can be easier to perform partial/temporary calculations in unused columns, say ZA, ZB, ZC, … then put the final formula in the main part of the sheet. It is not necessary to use a function in a formula. To multiply two numbers, just use =cellA*cellB instead of =SUM(cellA*cellB). The two values are multiplied by the * operator, and then SUM is not needed because there is only a single number to add. Don't make extra work for yourself by writing =SUM(cellC) when =cellC would do, or =SUM(cellA+cellB) when =cellA+cellB would do.

Note that questions are not allowed in the Tutorials section of the forum. Ask them in the Applications > Calc section. In addition, discussion about corrections or improvements should take place in the Applications > Calc forum where more people will see it and participate. If that discussion creates changes for this tutorial, I will update it. Keeping the discussion separate will make it easier for readers of the tutorial since they will not need to wade through interim dialog. I will ask a moderator to remove anything that you post here.
Locked