So we’ve seen numbers, but what about text?
A string (written str in Python) is the data type we use to store and work with text.
A piece of text is called a string, and you can perform all kinds of operations on a string. Let’s start with the basics first!
A string is a sequence of characters.
In simple terms, a string is a piece of text. Strings are not just a Python thing: the term is well known in computer science and means the same thing in most other languages.
A Python string needs quotes around it for it to be recognized as such, like this:
>>> 'Hello, World'
'Hello, World'
Because of the quotes, Python understands this is a sequence of characters and not a command, number, or variable.
And just like with numbers, some of the operators we learned before work on Python strings, too. Try it with the following expressions:
>>> 'a' + 'b'
'ab'
>>> 'ab' * 4
'abababab'
>>> 'a' - 'b'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for -: 'str' and 'str'
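What just happened? The + operator glues two strings together, and * repeats a string a given number of times. Subtraction has no meaning for text, so Python raises a TypeError.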
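The same three expressions can also be tried in a small script instead of the REPL. This is only a sketch: the variable names are made up for illustration, and the try/except block is one possible way to keep the script running after the error.

```python
# Hypothetical variable names, chosen only for this illustration.
greeting = 'Hello' + ', ' + 'World'   # + joins strings end to end
separator = '-' * 10                  # * repeats a string

print(greeting)
print(separator)

# Subtraction is not defined for strings, so Python raises a TypeError.
try:
    'a' - 'b'
    subtraction_worked = True
except TypeError:
    subtraction_worked = False

print(subtraction_worked)
```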
We’ve used single quotes, but Python accepts double quotes around a string as well:
>>> "a" + "b"
'ab'
Note that these are not two single quotes next to each other. The character is often found next to the enter key on your keyboard (US keyboard layout) or on the 2 key (UK layout). It might differ for other countries. You must press shift with this key to get a double quote.
As you can see from its answer, the Python REPL prefers single quotes. They look cleaner, and Python tries to be as clear and readable as possible. So why does it support both? Because it allows you to write strings that contain a quote.
>>> text1 = "It's a dog"
>>> text2 = 'It's a cat'
File "<stdin>", line 1
text2 = 'It's a cat'
^
SyntaxError: invalid syntax
We used double quotes to create text1, and there was no problem with the single quote in the word It’s.
But when we tried to create text2 with single quotes, we ran into trouble.
Python sees the quote in the word It’s and thinks that it’s the end of the string!
The following letter, s, causes a syntax error.
A syntax error is a character or string that is placed incorrectly in a command or instruction, and that causes a failure in execution.
In other words, Python doesn’t understand the s because it expects the string to have ended and so it fails with the error.
>>> 'It's a black cat'
File "<stdin>", line 1
'It's a black cat'
^
SyntaxError: invalid syntax
Python points out the exact location of where it encountered the error with the ^ symbol. Python errors tend to be very helpful, so look closely at them. You’ll often be able to pinpoint what’s going wrong.
Even the syntax highlighter on this website gets confused because of the invalid syntax!
Escaping Characters
So one way to deal with this syntax error is to just use double quotes.
But what if we wanted text that contains both single and double quotes?
He said "It's not possible"
Another way around these sorts of problems is called escaping.
You can escape a special character, like a quote, with a backslash:
>>> text = 'It\'s a green cat, would you believe it?'
>>> text
"It's a green cat, would you believe it?"
You can also escape double quotes the same way:
>>> text = "He said \"It's not possible\""
>>> text
'He said "It\'s not possible"'
Here, again, you see Python’s preference for single-quoted strings. Even though we used double quotes, Python echoes the string back to us using single quotes. It’s still the same string, though; it’s just represented differently. Once you start printing strings to the screen, you’ll see this.
So should you use single or double quotes? It’s simple: always opt for the option where you need the fewest escapes, because escapes make your Python strings less readable.
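To make the point concrete, here is a small sketch showing that the quote style is not part of the string itself. The variable names are illustrative only.

```python
# Two spellings of the same text; the names are illustrative only.
with_double = "It's a dog"     # double quotes outside, no escape needed
with_escape = 'It\'s a dog'    # single quotes outside, inner quote escaped

# Both variables hold exactly the same characters.
print(with_double == with_escape)   # True

# Double quotes inside the text: single quotes outside avoid any escaping.
nested = 'He said "hello"'
print(nested)
```

The version that needs no backslashes is usually the easier one to read.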
Python also has syntax for creating multiline strings using triple quotes. By this, I mean three double quotes or three single quotes; both work, and I’ll demonstrate both:
>>> """This text,
... has many lines...
... more than 2."""
'This text,\nhas many lines...\nmore than 2.'
>>> '''This text,
... also has many lines...
... more than 2.'''
'This text,\nalso has many lines...\nmore than 2.'
As you can see, Python echoes the string back as a regular, single-line string. In this string, you might have noticed the \n characters: this is how Python and many other programming languages escape special characters such as newlines.
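As a quick check of that claim, the sketch below compares a triple-quoted string with the same text written using \n escapes. The variable names are made up for this example.

```python
multiline = """This text,
has many lines...
more than 2."""
escaped = 'This text,\nhas many lines...\nmore than 2.'

print(multiline == escaped)    # True: both spellings produce the same string
print(multiline)               # print() shows the actual line breaks
print(repr(multiline))         # repr() shows the escaped form the REPL echoes
print(multiline.count('\n'))   # 2 newline characters separate the 3 lines
```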
The following table lists a couple of the most common escape sequences you will encounter:
Escaped sequence | Description |
---|---|
\n | A newline (newlines are generated with your return key). Advances to the next line |
\r | Carriage return: takes you back to the start of the line, without advancing to the next line |
\t | A tab character |
\\ | The backslash character itself: because it is used as the start of escape sequences, we need to escape this character too. Python keeps a backslash it doesn’t recognize as part of an escape sequence, but recent versions warn about it, so don’t rely on this. |
Unix-based operating systems like Linux use \n for a new line; the carriage return is implied. Windows uses \r\n instead. This difference has been, and will be, the cause of many bugs. So if you’re on Windows, you will see a lot of \r\n.
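One way to sidestep the line-ending difference, sketched here with made-up sample text, is the splitlines() method. It treats both \n and \r\n as a line break and returns a list (lists are covered later).

```python
unix_text = 'first\nsecond\nthird'
windows_text = 'first\r\nsecond\r\nthird'

# The raw strings differ because of the extra \r characters.
print(unix_text == windows_text)      # False

# splitlines() recognizes both kinds of line ending.
unix_lines = unix_text.splitlines()
windows_lines = windows_text.splitlines()
print(unix_lines == windows_lines)    # True
```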
The nice thing about triple quotes is that you can use both single and double quotes within them. So you can use triple quotes to cleanly create strings that contain both single and double quotes without resorting to escaping:
>>> text = '''He's asking: "How can I escape without backslashes?"'''
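As a small sketch, the sentence from the escaping section can be written with triple quotes and compared against the escaped version. The variable names are illustrative.

```python
# Triple quotes: both kinds of quote can appear unescaped.
triple = '''He said "It's not possible"'''

# The same text in a double-quoted string needs two escapes.
escaped = "He said \"It's not possible\""

print(triple == escaped)   # True: only the spelling in the source differs
print(triple)
```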
Strings come with several handy, built-in operations (also referred to as methods) you can execute. I’ll show you only a couple here since I don’t want to divert your attention too much.
In the REPL, you can sometimes use auto-completion. Whether it works or not depends on which installer you used to install Python and which OS you are on. Your best bet is to try!
Create a string s; on the next line, type its name followed by a dot and hit the TAB key twice:
>>> s = ""
>>> s.
s.capitalize( s.find( s.isdecimal( s.istitle( s.partition( s.rstrip( s.translate(
s.casefold( s.format( s.isdigit( s.isupper( s.replace( s.split( s.upper(
s.center( s.format_map( s.isidentifier( s.join( s.rfind( s.splitlines( s.zfill(
s.count( s.index( s.islower( s.ljust( s.rindex( s.startswith(
s.encode( s.isalnum( s.isnumeric( s.lower( s.rjust( s.strip(
s.endswith( s.isalpha( s.isprintable( s.lstrip( s.rpartition( s.swapcase(
s.expandtabs( s.isascii( s.isspace( s.maketrans( s.rsplit( s.title(
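Most IDEs, like PyCharm and VS Code, also support auto-completion.
In case that didn't work, you can use the dir and print functions to list all the operations our string supports. The real output also starts with special names wrapped in double underscores, such as __add__; they are left out here to keep the list short: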
>>> s = ''
>>> print(dir(s))
['capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format',
'format_map', 'index', 'isalnum', 'isalpha', 'isascii', 'isdecimal', 'isdigit', 'isidentifier', 'islower',
'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip',
'maketrans', 'partition', 'removeprefix', 'removesuffix', 'replace', 'rfind', 'rindex', 'rjust',
'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase',
'title', 'translate', 'upper', 'zfill']
>>>
W3Schools has a handy table that lists the common string methods with a short explanation of each.
Clicking a method in that list also gives you code examples.
For more detailed information on string operations, check out the official Python manual.
Transforming Text to UPPER case
>>> 'hello'.upper()
'HELLO'
Transforming Text to lower case
>>> 'WORLD'.lower()
'world'
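A typical use for lower() (or upper()) is comparing text while ignoring case. The sketch below uses made-up variable names and also shows that the original string is left untouched, because these methods return a new string.

```python
typed = 'YES'        # hypothetical user input
expected = 'yes'

print(typed == expected)    # False: comparison is case-sensitive

# Convert both sides to lower case before comparing.
matches = typed.lower() == expected.lower()
print(matches)              # True

print(typed)                # still 'YES': lower() did not change the original
```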
Getting Text Length
>>> len('Hello there')
11
>>> s = 'Jonny English'
>>> len(s)
13
The len() function can be used on many objects in Python, as you’ll learn later on.
Split Text apart
>>> 'I like apple pies'.split(' ')
['I', 'like', 'apple', 'pies']
The split() method takes one argument here, ' ' (one space), and uses it as the sequence of characters to split on. The output is a Python list (which we will learn about later) that contains all the separate words.
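The separator does not have to be a space. As a sketch with made-up sample data, the example below splits on a comma and then glues the pieces back together with join(), which appears in the method list above.

```python
row = 'apples,pears,plums'   # hypothetical comma-separated text

fruits = row.split(',')      # split on every comma
print(fruits)                # ['apples', 'pears', 'plums']

# join() is the reverse: it puts a separator between the items.
rejoined = ' and '.join(fruits)
print(rejoined)              # apples and pears and plums
```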
Splitting whitespace
>>> 'Hello \t\n there,\t\n\n General Kenobi'.split()
['Hello', 'there,', 'General', 'Kenobi']
A common use case is to split on whitespace. The problem is that whitespace can be a lot of things. Common ones are space characters, tabs, and newlines, and there are many more. To make things even more complicated, whitespace doesn’t mean just one of these characters but can also be a whole sequence of them.
Because this is such a common operation and because it’s hard to get right, Python has a convenient shortcut: calling split() without any arguments splits a string on any run of whitespace.
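The difference between split(' ') and split() shows up as soon as the text contains more than single spaces. A minimal sketch, with a made-up sample string:

```python
messy = 'one  two\tthree'    # two spaces, then a tab

# Splitting on exactly one space leaves an empty string and keeps the tab.
by_space = messy.split(' ')
print(by_space)              # ['one', '', 'two\tthree']

# Splitting without arguments treats any run of whitespace as one separator.
by_whitespace = messy.split()
print(by_whitespace)         # ['one', 'two', 'three']
```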
Text replacement
>>> 'I, am a Jedi!'.replace('Jedi', 'Cookie')
'I, am a Cookie!'
>>> 'Hello =)'.replace('l', '1')
'He11o =)'
>>> 'Get to the chopper!'.replace('chopper', 'choppah')
'Get to the choppah!'
The replace() method takes two arguments: the text to look for and the text to put in its place. It returns a new string in which every occurrence has been replaced.
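Two details about the replace examples above are worth a quick sketch: replace() accepts an optional third argument that limits how many occurrences are replaced, and it never changes the original string. The sample text is made up.

```python
original = 'spam spam spam'

all_replaced = original.replace('spam', 'eggs')      # every occurrence
first_only = original.replace('spam', 'eggs', 1)     # at most 1 occurrence

print(all_replaced)   # eggs eggs eggs
print(first_only)     # eggs spam spam
print(original)       # spam spam spam (unchanged)
```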
Getting a single character by its position index
>>> s = 'Python'
>>> s[0]
'P'
>>> s[1]
'y'
>>> s[2]
't'
>>> s[3]
'h'
>>> s[4]
'o'
>>> s[5]
'n'
Note that in Python, like in most programming languages, we start counting from 0.
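Indexes can also count from the end, and an index past the last character is an error. The sketch below illustrates both; the try/except block is just one way to show the error without stopping the script.

```python
s = 'Python'

last = s[-1]                 # 'n': -1 is the last character
second_last = s[-2]          # 'o'
also_last = s[len(s) - 1]    # 'n': the last valid index is len(s) - 1

print(last, second_last, also_last)

# 'Python' has indexes 0 to 5, so index 6 raises an IndexError.
try:
    s[6]
    in_range = True
except IndexError:
    in_range = False

print(in_range)              # False
```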
Slicing Text
>>> s = 'Python'
>>> s[0:2]
'Py'
>>> s[2:4]
'th'
>>> s[4:]
'on'
Slicing in Python works with the slicing operator, which looks like this: string[start:stop:step_size]. The character at start is included; the one at stop is not.
The step_size defaults to 1. We can test that by running the same slices as before, but with an explicit step_size of 1:
>>> s = 'Python'
>>> s[0:2:1]
'Py'
>>> s[2:4:1]
'th'
>>> s[4::1]
'on'
>>> s = 'Python'
>>> s[0:2:1]
'Py'
>>> s[0:2]
'Py'
>>> s[0:2:]
'Py'
Let's see more realistic usages with the step_size
>>> s[::2]
'Pto'
>>> s[1::2]
'yhn'
s[::2] would slice our string in steps of 2 like this:
Python => Pto
and s[1::2] would skip the first character and slice our string like this:
Python => yhn
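A few more slices, sketched with illustrative variable names, show how the defaults behave: a missing start means the beginning, a missing stop means the end, and a stop beyond the end is simply cut off at the end of the string.

```python
s = 'Python'

first_three = s[:3]      # 'Pyt': start defaults to 0
from_three = s[3:]       # 'hon': stop defaults to the end
every_third = s[::3]     # 'Ph': indexes 0 and 3
beyond_end = s[2:100]    # 'thon': no error, the slice stops at the end

print(first_three, from_three, every_third, beyond_end)

# Splitting at the same position and gluing back gives the original string.
print(first_three + from_three == s)   # True
```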
Text in Reverse
>>> s = 'Python'
>>> s[::-1]
'nohtyP'
Using what we've just learned about slicing, we can also slice in reverse.
We traverse the string from the end to the beginning by giving the slicing operator a negative step size of -1.
Because we leave the start and stop positions empty, Python slices the entire string.
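One small use for reversing, sketched here with a made-up word, is checking whether a text reads the same in both directions. Other negative step sizes work too.

```python
word = 'level'   # hypothetical example word

# A palindrome is equal to its own reverse.
is_palindrome = word == word[::-1]
print(is_palindrome)             # True

s = 'Python'
backwards_every_second = s[::-2]   # indexes 5, 3, 1
print(backwards_every_second)      # nhy
```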
Chaining operations
>>> 'Hi, Jack'.replace('Jack', 'Lola').upper()
'HI, LOLA'
The first operation (replace) results in a string. This string, like all strings, offers us the same operations again. So we can directly call the next operation (upper) on the result. It’s a handy shortcut.
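Chains can be longer than two steps, and the order of the steps matters. The sketch below uses a made-up input string and the strip() method from the list above, which removes whitespace at both ends.

```python
raw = '  Hello, WORLD  '   # hypothetical messy input

# Each method runs on the result of the previous one, left to right.
cleaned = raw.strip().lower().replace('world', 'there')
print(cleaned)             # hello, there

# In a different order, replace() runs before lower() and finds no
# lower-case 'world' to replace.
other_order = raw.replace('world', 'there').strip().lower()
print(other_order)         # hello, world
```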
String formatting
>>> price = 40
>>> product = 'Shoes'
>>> f'{product} cost {price}$'
'Shoes cost 40$'
A common pattern is the need to merge text strings or use a variable or expression inside your string. There are several ways to do so, but the modern way is to use f-strings, which is short for formatted strings.
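As a sketch of what "variable or expression" means here, the example below puts a calculation and a format specifier inside the braces, and compares an f-string with plain concatenation. The quantity variable is an addition made for this illustration.

```python
price = 40
product = 'Shoes'
quantity = 3   # hypothetical extra variable for this illustration

# Any expression can go inside the braces.
summary = f'{quantity} x {product} = {price * quantity}$'
print(summary)                 # 3 x Shoes = 120$

# A format specifier after a colon controls the presentation;
# .2f means two digits after the decimal point.
precise = f'{price / 3:.2f}'
print(precise)                 # 13.33

# Without f-strings, numbers must be converted with str() first.
concatenated = product + ' cost ' + str(price) + '$'
print(concatenated == f'{product} cost {price}$')   # True
```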
Just like before, I have gathered some basic challenges for you to go over.
Let's try out some of these:
We learned how to deal with strings (text), numbers and storing these values in variables. In the next lesson we will learn how to output all that data to your screen with Python's print() function.