Definitions

Argument: See Function

Array: A simple type of container that is not a built-in Python type, but is commonly used in other languages. Arrays are a bit like lists, but (a) often have a fixed number of elements, and (b) normally require all elements to be of the same type. The numpy module, however, is built around its own array class. These are properly called numpy arrays, but people often just call them arrays. See numpy section for details.

Boolean: A Boolean value (named after the mathematician George Boole) can take only one of two values, normally expressed as True or False.

Breakpoint: See Debugger. A point set in a program to force the debugger to stop running at full speed and enter line-at-a-time mode.

Class: The ‘blueprint’ for an Object. See object for a proper discussion!

Comment: Text inserted into a program for the sole purpose of human readability. Comments have no effect on the actual running of the program, but are an essential part of programming nonetheless. Turning statements into comments is also a convenient way to temporarily remove a line from your program, without permanently deleting it. In Python, use the # symbol to start a comment.

Comprehension: A Python term for a syntax-construction that enables you to do certain container iteration tasks with very little typing. See syntax section below for list comprehensions. Depending on who you ask, comprehensions are either wonderful labour-saving constructions that make everyone's life easier, or hellish syntactical minefields that wreck code-readability and hamper debugging. I tend to lean towards the latter view!
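
As a minimal sketch (the variable names are made up for illustration), here is one small task written first as an ordinary loop and then as a list comprehension, so you can judge the readability trade-off for yourself:

```python
# Build a list of the squares of the even numbers from 0 to 9.
numbers = range(10)

# Ordinary loop version
squares_loop = []
for n in numbers:
    if n % 2 == 0:
        squares_loop.append(n ** 2)

# List comprehension version: the same result in one line
squares_comp = [n ** 2 for n in numbers if n % 2 == 0]

print(squares_loop)
print(squares_comp)
```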

Condition: Something that a programming language can evaluate to a Boolean value (a True or False), and hence use as the basis of a decision that affects program flow.

Container (or container class, or collections class, or iterable – you can treat these as synonyms): An object comprising multiple elements. The most obvious and useful example in Python is the list class – lists contain a number of elements in some defined order, referred to by their index number. Other containers exist (sets, dictionaries and tuples are the most commonly encountered ones – not dealt with this week – plus an odd list-like one that is returned by the range function – see below).

CSV (comma separated value) file: csv files are the most straightforward way of storing numerical data, and are very widely used. They can be easily generated by and read by spreadsheet software. They consist of data in text format, with each data item separated by a comma. Typically they are used to store tables, so there will be many lines in the text file, each with the same number of comma-separated data items. The first line may or may not be a 'header' line giving the names of the columns. When you encounter text rather than numbers in a csv file, it may or may not be enclosed in either single or double quotes. To check the precise format of a csv file, just open it and look at it. Don't do this in Excel, which will try to do 'clever' translations of the data and not help you understand what it really looks like. Instead, view it in a text-editor like Notepad, or open it within VS Code.
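
A rough sketch of reading csv data with Python's standard csv module. The table contents and column names are invented for illustration, and the text is read from a string so the example is self-contained; with a real file you would pass an opened file object to csv.reader instead.

```python
import csv
import io

# A small made-up table with a header line.
csv_text = "name,height_m\nAlice,1.62\nBob,1.80\n"

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)   # first line: the column names
rows = list(reader)     # remaining lines: one list of strings per row

# Every item arrives as text, so numbers must be converted explicitly.
heights = [float(row[1]) for row in rows]

print(header)
print(heights)
```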

Debugger: A debugger (properly symbolic debugger) is a piece of software that enables a programmer to step through a program line-at-a-time, watching variable values, and following exactly what their program is doing. This is not the only way to debug a program, but is often the easiest way. Debuggers can either be set to step through programs when the user presses a key, or to run the program as normal (full-speed mode) until it reaches a predefined point (a breakpoint). In Spyder,

Dictionary: A Python container (=iterable, =collection) class, in which data is stored as key/value pairs - see syntax section for details of how to use them. Keys are a bit like the indices of lists, in that you use them to refer to particular elements, except that they don't have to be numbers, they don't have to be a continuous sequence, and they don't have to start at 0. Values = elements. Since Python 3.7, dictionaries keep key/value pairs in the order in which they were inserted, but you still look elements up by key rather than by position. In other languages, very similar containers to Python's dictionaries may be called 'hash tables', or 'associative arrays'.
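
A short sketch of the basic dictionary operations, using made-up names and ages as data:

```python
# Keys here are strings; values are integers.
ages = {"alice": 31, "bob": 27}

ages["carol"] = 45          # add a new key/value pair
bob_age = ages["bob"]       # look up a value by its key
has_dave = "dave" in ages   # test whether a key exists

# Iterate over all key/value pairs.
for name, age in ages.items():
    print(name, age)
```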

Element: an individual item in a container.

Exception: An error that occurs during program execution (we say that the exception is thrown or, in Python terminology, raised). In Python, exceptions have a type, and you can check for particular types of error using try / except (see syntax section).
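
A minimal sketch of catching one particular type of exception. The function name safe_divide is hypothetical, chosen for this example only:

```python
def safe_divide(numerator, denominator):
    """Return numerator / denominator, or None if the denominator is zero."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        # Only this specific type of exception is handled here;
        # any other type of error would still stop the program.
        return None


print(safe_divide(10, 2))
print(safe_divide(10, 0))
```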

File: I hope you all know what a file is, but just in case… a file is a chunk of data stored on a computer's permanent storage (hard disk, network drive, etc). Files have a 'type' – a hint to programs about what type of data is stored within, and how it's encoded. File types are indicated by the extension – the text after a '.' in the filename, so myfile.csv has a .csv extension, and is in csv (comma separated value) format. It is very common in scientific programming to want to process a file – read in data from the file, do something with that data, then either print some results or output processed results into a new file that you can then view in some other program (Excel perhaps). Programs that do the latter do not look exciting when running, but are very practical. In this course we will be concentrating on reading/writing .csv files.

Float: a shorthand for Floating Point Number.

Floating point number: The normal method of storing non-integer numbers in a computer. Floating point numbers are stored in a mantissa/exponent format (like 10.43223 × 10^12, except they use powers of 2 not 10). You don’t need to understand the details of how they are stored, but there are some important consequences that you DO need to understand, which arise because they are only approximations to the real numbers they represent. There may be tiny errors in results of calculations arising from this imprecision. Remember in Session 1 when we divided 10 by 3 and got 3.3333333333333335? That 5 arose from floating point inaccuracies. Precision depends on magnitude, so if (for instance) you add 1 to a huge floating point number (10^100 for instance) you may find that the result is still 10^100 not 10^100 + 1 – the precision available may not be enough to tell the difference. You may also find the occasional result like 7.9999999999 from a calculation where you expected 8. Most critically, avoid EVER using == or != comparisons on floating point numbers. 0.1 + 0.2 == 0.3 evaluates to False (because 0.1 + 0.2 evaluates to 0.30000000000000004, which is not quite 0.3).
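
One common alternative to == is to test whether two floats are close enough, for example with math.isclose from the standard library. The sketch below uses its default tolerance, which may or may not suit your own data:

```python
import math

total = 0.1 + 0.2
print(total)                               # very slightly more than 0.3

exactly_equal = (total == 0.3)             # False: the tiny error breaks ==
close_enough = math.isclose(total, 0.3)    # True: compares within a tolerance

huge = 1e100                               # 10 to the power 100
unchanged = (huge + 1 == huge)             # True: adding 1 is lost at this magnitude
```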

Function: a separate piece of code that ‘does something’. Functions usually have to be given (passed) one or more values to work on. These passed values are called arguments. Many functions also pass a value back (the return value) to the main program that launched (called) them. Some functions ('user defined functions') you will create yourself. Others are built into Python (e.g. the print function). Others are written by third parties, and you have to import them. Python function syntax (much the same as in all programming languages) is explained in the Syntax section.
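
To tie the terms together, here is a small sketch of a user defined function. The name rectangle_area is invented for this example:

```python
def rectangle_area(width, height):
    """Return the area of a rectangle."""
    return width * height


# Calling the function: 3 and 4 are the arguments passed to it,
# and the return value is stored in the variable area.
area = rectangle_area(3, 4)
print(area)
```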

Integer: A whole number. In computing, an integer is a whole number stored in a specific binary format, using a particular number of bits (binary digits). Most languages use a set number of bits for this (32 or 64 typically), which means that integers can ‘overflow’ if they get too large (too large means in the billions at the very least). Python integers never overflow – they expand in size in memory to be as big or as small as they need to be. This makes life easy for us in this course, but if you ever move to another language, remember that integers may have limits.
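
A quick illustration that Python integers grow as needed. The number below needs 101 bits, so it would not fit in a 64-bit integer in a language with fixed-size integers:

```python
big = 2 ** 100
print(big)
print(big.bit_length())   # number of binary digits needed to store it
```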

Integrated Development Environment (IDE): A program/application (same thing) designed to bring all the tools you need for programming into one place, and to make the life of a programmer as easy as possible. Spyder is the IDE we are using for this course. Use of an IDE in programming is optional.

Import: The process of bringing a module, package or library into your program so you can use it – this is a general term used in many languages, though most don’t actually use the keyword ‘import’ to do it (Python does).

iPython: Short for Interactive Python. iPython (or iPython notebooks) is an older name for what has now morphed into Jupyter Notebooks, though it’s not quite that simple - the Jupyter Notebooks project is really a split-off from iPython, which also still exists and is used by some people. You are more likely to come across Jupyter Notebooks though, so I’ve defined it all there. Sometimes iPython can be used as the name of the interactive console, where you see output - Spyder does this.

Iterable: A Python term for anything that can be iterated over – in practice, this is much the same thing as a container.

Iterating: looping. We often talk about 'iterating over a list', which means executing a loop to look at each element in turn.

Jupyter Notebook: Jupyter Notebooks (JNs) are a different way of running Python code, interspersing it with text and graphics and enabling you to run bits of it at a time, rather than just running single programs. The idea is that this enables you to better document what you are doing, and easily share this with others, as well as to break your code up into small chunks that you can look at the results of separately. JN files have the .ipynb extension, as Jupyter Notebooks have evolved from the older but very similar iPython concept. JNs are currently very trendy and you will probably come across them at some point, but in this course I simply introduce you to them in concept, and I will demonstrate one in class. While there are a lot of advantages to working this way, using JNs does remove some of the conveniences of an IDE like VSCode (such as the debugger), and for this reason I don’t think they are the best environment in which to first learn the language. Once you are familiar with Python though, there is a lot to be said for working in this way, and the switch-over is not hard.
JNs are run within a web-browser, but confusingly that doesn’t always mean they are running the Python code remotely in the cloud, or that you can run them with a web-browser without setting things up first. The ‘traditional’ way to run them is to set up a Jupyter Notebooks system on your machine (e.g. there is one included in Anaconda), and to start the notebook through that system, running it locally on your computer. Alternatively, there are now cloud-based JN systems, notably through Microsoft Azure, or using Google Colab. These are a bit easier in that there is no set-up work. To create a Google Jupyter Notebook for instance (assuming you have a Google account), just go to your Google Drive on a web-browser, click New and then More, then select ‘Google Colaboratory’. One catch with these cloud-based systems though is that if you want to use files, you’ll need to upload those to the cloud, and use the correct Google modules in Python to access them. Not that hard in practice, but it’s extra faff.

Keyword Argument (commonly abbreviated to kwarg): An alternative syntax for passing arguments to functions or methods. Using kwargs allows you to put arguments in any order, and by providing names for them in your function call, to improve code readability. Not all built-in functions support kwargs, but you can always use them with user-defined functions.
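
A brief sketch comparing the two calling styles. The function describe_sample and its parameter names are hypothetical:

```python
def describe_sample(name, mass_g, length_mm):
    """Build a one-line description of a sample."""
    return f"{name}: {mass_g} g, {length_mm} mm"


# Positional arguments: the order matters.
positional = describe_sample("A1", 12.5, 40)

# Keyword arguments: the names document the call, and the order is free.
keyword = describe_sample(length_mm=40, name="A1", mass_g=12.5)

print(positional)
print(keyword)
```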

Kwarg: Abbreviation of Keyword Argument

Library: A pre-written set of functions for some particular purpose or set of purposes that you can bring into your program (import). There are many many libraries for Python already installed on your computer.

Linting: Automatic checking of code for ‘style issues’ or errors, underlining them in red. Spyder provides automatic linting - warnings appear as orange exclamation marks to the left of the code. Linting will not only flag up mistakes, but will also flag up ways in which your code isn’t quite laid out according to standards. These might (will!) all seem like very minor issues, but consistently laid out code really is easier for other people to read, so try to follow these rules.

List: A simple type of container class. See syntax section for details of Python lists.

Matplotlib: A very commonly used graphing/plotting/charting package for python, with support for pretty well any type of graph you would ever want to produce. It’s also pretty easy to use, though not without its little quirks. See matplotlib section and also matplotlib functions.

Method: a function provided as part of a class, intended to work on the data of that class. For instance, capitalize() is a method of the Python str class – it returns a version of the string with an initial capital letter.

Module (or package, or library): A set of functions (and other stuff*) that you can import into your program to provide extra functionality. Package and library are not quite synonyms of module, as both terms can be used for groups of related modules as well as single ones. In Python, at its simplest, a module is simply a .py file in which functions are defined. *The ‘other stuff’ sometimes includes definitions of constants (e.g. math.pi), and definitions of classes with methods for complex data-types.

Nesting: Putting a program structure inside another program structure. Pretty well anything can be nested in programming, including function calls, if statements, loops etc. Nesting takes two forms in Python. Nesting within expressions - e.g. float(input("hello")) – consists of brackets within brackets (innermost always happens first). Nesting of control structures (if, while etc.) is done using indentation. Most languages encourage this indentation for readability, but in Python it is mandatory.
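
Both forms of nesting in one short sketch (the variable names are illustrative only):

```python
# Nesting within an expression: the innermost call, str(...), runs first,
# then len(...) works on its result.
length = len(str(12345))

# Nesting of control structures: an if inside a for loop,
# with each level shown by a further step of indentation.
evens = []
for n in range(6):
    if n % 2 == 0:
        evens.append(n)

print(length)
print(evens)
```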

Numpy: Short for Numerical Python – a 3rd party module very widely used in numerical and scientific applications – although it’s NOT part of core python, you are likely to use it in a lot of programs. The most important thing it provides is a high-performance array class which supports vectorization – see Array above – but there are many useful functions as well. Numpy is meant to be pronounced to rhyme with ‘pie’, not with ‘pea’. I may ignore this though!

Object: An instance of a class. The distinction between objects and classes in Python can be confusing and is a little blurry because of some esoteric aspects of the language design (if you Google this you will find some people telling you that classes are also objects in Python, which though true is very unhelpful). For the purposes of this course, think of a class as a particular type of data together with some functions designed to work on that data. The functions are properly called methods when they are attached to a class, but they are still functions – they use brackets, have arguments, and return values. The str class in Python is a good example of a class – it stores text (obviously), and provides many methods that do things to that text (capitalize(), upper(), lower(), etc.). An object is a particular instance of that class, so when you say my_string = "hello", you have made an object – an instance of the str class. Classes can thus be thought of as blueprints for objects. See objects under syntax for more on objects in Python.
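
To make the blueprint idea concrete, here is a minimal sketch of a user-defined class and two objects made from it. The Circle class is invented for this example and is not part of Python or of the course material:

```python
import math


class Circle:
    """A blueprint for circle objects."""

    def __init__(self, radius):
        self.radius = radius   # the data stored in each object

    def area(self):
        """A method: a function attached to the class."""
        return math.pi * self.radius ** 2


small = Circle(1.0)   # one object (an instance of Circle)
large = Circle(2.0)   # another, independent object of the same class

my_string = "hello"   # also an object: an instance of the built-in str class

print(small.area())
print(large.area())
print(my_string.upper())
```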

Pass: See Function

Program: A series of commands to a computer used to perform some particular task, i.e. a piece of software. Application or App mean essentially the same thing. Programs are created as text using a programming language, and then translated by other programs into the machine language that the computer actually understands (you never really see this stage). Note – standard is American spelling here, so program not programme.

Programming language: A standardised set of rules for how you should phrase commands to a computer that make up a program. Python is a programming language – other common ones include C, C++, Java, but there are many hundreds in existence. Some are specialised for particular tasks – others (like Python) are general-purpose.

Python: A general-purpose programming language that we are using in this course.

Recursion: a programming technique in which functions call themselves to implement some algorithm, usually to navigate ‘tree-like’ data. Recursion is not a straightforward technique to use if you are new to programming, but for certain tasks it can be by far the most efficient approach. See recursion section.
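
A small sketch of recursion over tree-like data, here lists nested inside lists. The function name total_leaves and the sample data are made up for illustration:

```python
def total_leaves(tree):
    """Add up all the numbers in a nested list, however deep the nesting."""
    total = 0
    for item in tree:
        if isinstance(item, list):
            total += total_leaves(item)   # the function calls itself
        else:
            total += item
    return total


nested = [1, [2, 3], [4, [5, 6]]]
result = total_leaves(nested)
print(result)
```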

Return value: See Function

Set: A Python container (=iterable, =collection) class, less commonly used than lists or dictionaries. Sets are the same things you've come across in maths – crucially, they can only contain one copy of a particular element. Intersection, Union etc. operations are possible. See syntax section for details of how to use them.
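
A short sketch showing that duplicates are dropped, plus the intersection and union operations (the values are arbitrary):

```python
a = {1, 2, 3, 3, 3}   # the repeated 3s collapse into a single element
b = {3, 4}

common = a & b        # intersection: elements in both sets
combined = a | b      # union: elements in either set

print(len(a))
print(common)
print(combined)
```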

Spyder: The Integrated Development Environment used in this module

Statement: A line of a program that does something. Some statements will be function calls (e.g. a print line). Others may assign values to variables, start or end control structures like loops, or more. In Python you normally have one statement per line, but some other languages complicate this.

String: text (called a string as text is a string of characters).

Syntax Error: A programming error that is detected before the program runs… things like missing brackets or colons, or other unambiguously illegal code.

Tuple: Tuples are ‘lightweight lists’ – they work a lot like lists, except that you can’t change them once they have been created. They are used for, among other things, returning multiple values from functions. Tuples do exist in some other languages, but far from all.
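
A sketch of the two points above: returning several values from a function, and the fact that a tuple cannot be changed after creation. The function name min_and_max is hypothetical:

```python
def min_and_max(values):
    """Return the smallest and largest value as a tuple."""
    return min(values), max(values)


# The returned tuple can be unpacked straight into two variables.
lowest, highest = min_and_max([4, 9, 1, 7])

point = (2.0, 3.5)
try:
    point[0] = 5.0      # not allowed: tuples cannot be modified
    changed = True
except TypeError:
    changed = False

print(lowest, highest, changed)
```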

Type: (of variables or other data). The type of information stored, and/or the coding scheme used to store them. Understanding what type your variables are is critical to understanding what is going on in a program. Simple variable types in Python are int (short for Integer), float (short for Floating point number), bool (short for Boolean, i.e. a True/False variable) and str (string).
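
If you are unsure what type a value has, the built-in type function will tell you. A quick sketch with one example value of each simple type:

```python
values = [42, 3.14, True, "hello"]

# type(v).__name__ gives the short name of each value's type.
type_names = [type(v).__name__ for v in values]
print(type_names)
```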

User Defined Function: A function that you write, as part of your program.

Variable: A variable is best thought of as a box in the computer’s memory which can store some value for you to do something with later. Variables have names which the programmer assigns – these should be chosen to aid readability of the program, i.e. they should give at least a hint as to what the value is used for (e.g. age_in_years, not just a). Variables have a ‘type’ – the sort of value they hold (e.g. a string or a number – it’s actually much more complex than that, but that will do for now). Once you put a value into a variable, it stays there until you put a new value into it.

Vectorization: Vectorization is a programming technique allowing simple operations to be applied to every element of a container. Normal python containers (lists, dictionaries etc.) do not support vectorization, but numpy arrays do. Vectorization is a huge time-saver, both in terms of time spent writing code, and in execution speed.