“Hello, World!”
A string is an array of characters.
We can assign a string to a variable
my_string = "Hello, World"
A string is often identified by being surrounded by single or double quotation marks
string1 = 'this is a string'
string2 = "this is also a string"
A single quote enclosed string must have its contained single quotes escaped with the \
character. The same applies for double quoted strings with double quotes.
string3 = 'Good example, isn\'t it?'
string4 = "Wow, programming must be \"so fun\"..."
There are many inbuilt string functions
len(<str>)
<str>.lower()
<str>.upper()
<str>.startswith(...)
<str>.endswith(...)
<str>.zfill(...)
Strings can be combined together.
"Hello, " + "World!" # Hello, World!
String variables can also be combined
string5 = "Hello, "
string6 = "World!"
string5 + string6 # Hello, World!
String variables can also be added to strings
string5 + "World!" # Hello, World!
Strings can only be concatenated with other strings!
To combine variables and values that are not strings, we must call the str(...)
method (or similar) on those variables
age = 9
name = "Tony"
name + "is " + age + " years old" # < TypeError >
name + "is " + str(age) + " years old" # Tony is 9 years old
In Python, we count from 0
!
Therefore indexing also starts from zero
string = "ABCDE"
string[0] = "A"
string[1] = "B"
string[2] = "C"
string[3] = "D"
string[4] = "E"
string[5] DOES NOT EXIST
string = "ABCDE"
We can extract part of the string using these indexes
stringVariable[from:to-1]
string[2:4] == "CD"
string[0:3] == "ABC"
If extracting from the start to a certain index, we can omit the 0
stringVariable[:to-1]
string[:3] == "ABC"
If extracting a certain index to the end of the string, we can omit the value after the :
character
stringVariable[3:]
string[3:] == "DE"
We can select a pattern of characters in the string by setting the step
option of our index.
string[from:to-1:step]
string = "ABCDEF"
string[0:5] == 'ABCDE'
string[0:5:1] == 'ABCDE'
string[0:5:2] == 'ACE'
string[0:5:3] == 'AD'
A negative step could even reverse a string!
string[::-1] == 'EDCBA'
Whilst we can access a specific position in a string, we cannot directly modify it
myString = "abcde"
myString[2] == "c"
myString[2] = "C" # ERROR
String concatenation can be performed through string replacement methods and modifiers
myFmtString1 = "Hello, %s!" % "world"
%s
is replaced with a passed string
If multiple format strings are used, the passed values should be passed as a tuple
myFmtString2 = "%s is %d years old" % ("Tony", 9)
%d
is replaced with a passed integer
Strings also have a .format(...)
function, which replace the placeholders in a string with values passed into the function.
myFmtString3 = '{1} is {0} years old'.format(9, "Tony")
# Tony is 9 years old. His favourite colour is blue
The .format(...)
function takes in both positional and keyword values
myFmtString4 = "{} goes {sound}".format("cow", sound="moo")
# Cows go moo
Introduced in Python 3.6, format string literals can also be used to format values.
name = "Tony"
age = "9"
f"{name} is {age} years old"
# Tony is 9 years
f"16 x 9 is {16 * 9}"
# 16 x 9 is 144
Escape characters (control data) can be placed in a string, commonly new-line (\n
) and tab (\t
) characters
myString1 = "Hello World"
myString2 = "Hello\nWorld"
print(myString1)
# Hello World
print(myString2)
# Hello
# World
A string can span multiple lines of code
multilineString1 = """Hello there
this is a
multiline string
"""
They function as normal strings, and can be used with string formatting techniques, as well as as an f-string
adjective = "cool"
multilineString2 = f'''
I think
that's pretty {adjective}!
'''
Raw strings prevent escape characters from being evaluated
normalString = "Hello\nWorld!"
# Hello
# World!
rawString = r"Hello\nWorld!"
# Hello\nWorld!
In Python 2, to use unicode (UTF-8) characters, an encoding header must be put at the start of the file, and the strings needs a u
prepended to the string
-*- coding: utf-8 -*-
# Python 2.X
unicodeString = u"♥"
In Python 3, strings are unicode by default, so nothing needs to be done
# Python 3.X
unicodeString = "♥"
Bytes have a numerical value from 0x00
(0
) to 0xFF
(255
). A byte-string contains only byte characters.
<str>.encode()
is used to convert a string into a bytestring
unicodeString = "♥"
unicodeString.encode() # b'\xe2\x99\xa5'
<bytes>.decode()
is used to convert a bytestring into a string
byteString = b'\xf0\x9f\x98\x8a'
byteString.decode() # 😊