CS100 Lecture 2
Variables I and Arithmetic Types
Contents
- Variable declaration
- Arithmetic types
- Bits and bytes
- Integer types
- Real floating types
- Character types
- Boolean type
Variable declaration
Type of a variable
Every variable in C has a type.
- The type is fully deterministic and cannot be changed.
- The type is known even when the program is not run.
- \(\Leftrightarrow\) The type is known at compile-time.
- \(\Leftrightarrow\) C is statically-typed \({}^{\textcolor{red}{1}}\). \(\Leftrightarrow\) C has a static type system.
- In contrast, Python is dynamically-typed.
Statically-typed vs dynamically-typed
a = 42 # Type of a is int.
a = "hello" # Type of a becomes str.
The type of a variable
- can be changed, and
- is not necessarily known until we run the program.
int a = 42; // Type of a is int.
a = "hello"; // Error! Types mismatch!
The type of a variable
- is explicitly written on declaration, and
- is known at compile-time, and
- cannot be changed.
A type-related error in C is (usually) a compile error:
- It stops the compiler. The executable will not be generated.
Declare a variable
To declare a variable, we need to specify its type and name.
Type name;
Example:
int x; // Declares a variable named `x`, whose type is `int`.
double y; // Declares a variable named `y`, whose type is `double`.
We may declare multiple variables of the same type in one declaration statement, separated by commas (,):
int x, y; // Declares two variables `x` and `y`, both having type `int`.
Declare a variable
A variable declaration can be placed
- inside a function, which declares a local variable, or
- outside of any functions, which declares a global variable.
#include <stdio.h>
int x, y; // global variables
int main(void) {
scanf("%d%d", &x, &y);
printf("%d\n", x + y);
}
#include <stdio.h>
int main(void) {
// local variables in `main`
int x, y;
scanf("%d%d", &x, &y);
printf("%d\n", x + y);
}
Local variables vs global variables
Which one do you prefer?
What are these variables used for?
#include <stdio.h>
// Other #includes
int x, y; // What are these two variables used for?
int moveSpaceShuttle(SpaceShuttle *shuttle, Coord to, Vehicle *by) {
// 109 lines
}
int makePreparations(Environment *env, Task tasks[], Time time) {
// 73 lines
}
LaunchResult launchSpaceShuttle(SpaceShuttle *shuttle, Task tasks[]) {
// 35 lines
}
// Other 136 functions, 3325 lines in total
int main(void) {
// 120 lines
}
Readability matters
[Best practice] Declare the variable when you first use it!
- If the declaration of a variable is far from its use, it becomes much harder to figure out what the variable is for as the program grows longer.
[Best practice] Use meaningful names!
- The program would be a mess if polluted with names like a, b, c, d, x, y, cnt, cnt_2, flag1, flag2, flag3 everywhere.
- Use meaningful names: sumOfScore, student_cnt, open_success, ...
Readability is very important. Many students debug day and night simply because their programs are not human-readable.
Use of global variables
One reason for using global variables is to have them shared between functions:
int input;
void work(void) {
printf("%d\n", input);
}
int main(void) {
scanf("%d", &input);
work();
}
void work(void) {
// Error: `input` was not declared
// in this scope.
printf("%d\n", input);
}
int main(void) {
int input;
scanf("%d", &input);
work();
}
\(\Rightarrow\) More about scopes and name lookup in later lectures / recitations.
Initialize a variable
A variable can be initialized on declaration.
int x = 42; // Declares the variable `x` of type `int`,
// and initializes its value to 42.
int a = 0, b, c = 42; // Declares three `int` variables, with `a` initialized
// to 0, `c` initialized to 42, and `b` uninitialized.
This is syntactically different from (though seemingly equivalent to)
int x; // Declares `x`, uninitialized.
x = 42; // Assigns 42 to `x`.
[Best practice] Initialize the variable if possible. Prefer initialization to later assignment.
\(\Rightarrow\) More on initialization in later lectures.
Arithmetic types
Refer to this page for a complete, detailed and standard documentation.
Integer types
Is int equivalent to \(\mathbb Z\)?
- Is there a limitation on the numbers that int can represent?
Experiment:
#include <stdio.h>
int main(void) {
int x = 1;
while (1) {
printf("%d\n", x);
x *= 2; // x = x * 2
getchar();
}
}
Bits and bytes
Information is stored in computers in binary.
- \(42_{\text{ten}}=101010_{\text{two}}\).
A bit is either \(0\) or \(1\).
- The binary representation of \(42\) consists of \(6\) bits.
A byte is \(8\) bits \({}^{\textcolor{red}{2}}\) grouped together like \(10001001\).
- At least \(1\) byte is needed to store \(42\).
- At least \(3\) bytes are needed to store \(142857_{\text{ten}}=100010111000001001_{\text{two}}\)
Bits and bytes
A 32-bit number: \(2979269462_{\text{ten}}=10110001100101000000101101010110_{\text{two}}\).
Suppose now we have \(n\) bits.
- How many different values can be represented?
  - \(2^n\).
- What is the largest integer that can be represented?
  - \(2^n-1=\underbrace{111\dots 1}_{n}{}_{\text{two}}\).
- How do we represent negative numbers? Non-integer values? ...
  - There are several different signed number representations, among which two's complement is widely used.
  - About floating-point numbers: IEEE754
  - Details are not covered in CS100.
Integer types
An integer type in C is either signed or unsigned, and has a width denoting the number of bits that can be used to represent values.
Suppose we have an integer type of \(n\) bits in width.
- If the type is signed \({}^{\textcolor{red}{3}}\), the range of values that can be represented is \(\left[-2^{n-1},2^{n-1}-1\right]\).
- If the type is unsigned, the range of values that can be represented is \(\left[0, 2^n-1\right]\).
Integer types
- The keyword int is optional in types other than int:
  - e.g. short int and short name the same type.
  - e.g. unsigned int and unsigned name the same type.
- "Unsigned-ness" needs to be written explicitly: unsigned int, unsigned long, ...
- Types without the keyword unsigned are signed by default:
  - e.g. signed int and int name the same type.
  - e.g. signed long int, signed long, long int and long name the same type.
Width of integer types
- A signed type has the same width as its unsigned counterpart.
- It is also guaranteed that sizeof(short) \(\leqslant\) sizeof(int) \(\leqslant\) sizeof(long) \(\leqslant\) sizeof(long long).
- sizeof(T) is the number of bytes that an object of type T occupies.
Implementation-defined behaviors
The standard states that the exact width of the integer types is implementation-defined.
- Implementation: The compiler and the standard library.
- An implementation-defined behavior depends on the compiler and the standard library, and is often also related to the hosted environment (e.g. the operating system).
Which one should I use?
int is generally the most efficient integer type for the platform.
- Use int for integer arithmetic by default.
- Use long long if the range of int is not large enough.
- Use smaller types (short, or even unsigned char) for memory-saving or other special purposes.
- Use unsigned types for special purposes. We will see some in later lectures.
Which one is the real world, the integer types or \(\mathbb Z\)?
Real floating types
"Floating-point": The number's radix point can "float" anywhere to the left, right, or between the significant digits of the number.
Real floating-point types can be used to represent some real values.
- Real floating-point types \(\neq\mathbb R\).
Real floating types
C has three types for representing real floating-point values:
- float: single precision. Matches IEEE754 binary32 format if supported.
- double: double precision. Matches IEEE754 binary64 format if supported.
- long double: extended precision. A floating-point type whose precision and range are at least as good as those of double.
Details of IEEE754 formats are not required in CS100.
Range of values can be found in this table.
Which one should I use?
Use double for real floating-point arithmetic by default.
- In some cases the precision of float is not enough.
- Don't worry about efficiency! double arithmetic is not necessarily slower than float.
Do not use floating-point types for integer arithmetic!
scanf/printf
Refer to the table in this page.
- %f for float, %lf for double, and %Lf for long double. (In printf, %f also prints a double, because float arguments are promoted to double.)
Exercise
Write the "A+B" program for real numbers. Which type do you decide to use? How do you read and print the values?
#include <stdio.h>
int main(void) {
double a, b;
scanf("%lf%lf", &a, &b);
printf("%lf\n", a + b);
return 0;
}
Character types
The C standard provides three different character types: signed char, unsigned char and char.
Let T \(\in\{\)signed char, unsigned char, char\(\}\). It is guaranteed that
1 == sizeof(T) <= sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long).
- T takes exactly 1 byte.
Question: What is the valid range of signed char? unsigned char?
- signed char: \([-128, 127]\).
- unsigned char: \([0, 255]\).
What? A character is an integer?
ASCII (American Standard Code for Information Interchange)
A character is represented in computers as its ASCII code, which is a small integer.
- We only consider the so-called ASCII characters here.
A character is nothing but an integer! In C, there is no "conversion" between characters and ASCII code!
ASCII (American Standard Code for Information Interchange)
Important things to remember:
- \([\)'0'\(,\)'9'\(]=[48, 57]\).
- \([\)'A'\(,\)'Z'\(]=[65, 90]\).
- \([\)'a'\(,\)'z'\(]=[97, 122]\).
Example: Given a lowercase letter, return its uppercase form.
char to_uppercase(char x) {
return x - 32;
}
[Best practice] Avoid magic numbers
What is the meaning of 32 here? \(\Rightarrow\) a magic number.
char to_uppercase(char x) {
return x - 32;
}
Write it in a more human-readable way:
char to_uppercase(char x) {
return x - ('a' - 'A');
}
Escape sequence
Some special characters are not directly representable: newline, tab, quote, ...
We use escape sequences, e.g.
Character types
char, signed char and unsigned char are three different types.
- Whether char is signed or unsigned is implementation-defined.
- If char is signed (unsigned), it represents the same set of values as the type signed char (unsigned char), but they are not the same type.
- In contrast, T and signed T are the same type for T \(\in\{\)short, int, long, long long\(\}\).
Character types
For almost all cases, use char (or, sometimes int) to represent characters.
signed char and unsigned char are used for other purposes.
To read/print a char using scanf/printf, use %c.
Sad story: Handling non-ASCII characters? ...
Even though the standard provides wchar_t, char8_t (since C23), char16_t and char32_t to handle wide/Unicode characters, there are still a lot of problems.
C++23 has some improvements.
That's why Python people laugh at us ...
Boolean type: bool (since C99)
A type that represents true/false, 0/1, yes/no, ...
To access the names bool, true and false, <stdbool.h> is needed. (until C23)
Example: Define a function that accepts a character and returns whether that character is a lowercase letter.
int is_lowercase(char c) {
if (c >= 'a' && c <= 'z')
return 1;
else
return 0;
}
bool is_lowercase(char c) {
if (c >= 'a' && c <= 'z')
return true;
else
return false;
}
Both return values can be used as follows:
char c; scanf("%c", &c);
if (is_lowercase(c)) {
// do something when c is lowercase ...
}
[Best practice] Simplify your code
Just return the result of the condition expression.
int is_lowercase(char c) {
return c >= 'a' && c <= 'z';
}
bool is_lowercase(char c) {
return c >= 'a' && c <= 'z';
}
We will introduce the operators (&&, <=, >=) involved here in later lectures.
Summary
- Variable declaration
- Type + name
- Multiple variables in one declaration statement
- Global vs local
- Initialization
- Arithmetic types
- Width, signed-ness, valid range
- Which type to choose
- Characters: ASCII code, escape sequence
- Boolean
Exercise
Write a simple calculator that handles input of the form x op y, where x and y are floating-point numbers and op \(\in\{\) '+', '-', '*', '/' \(\}\). You may use a group of if-else statements like this:
if (op == '+') {
// ...
} else if (op == '-') {
// ...
} else if (op == '*') {
// ...
} else if (op == '/') {
// ...
} else {
// report an error
}
Notes
\({}^{\textcolor{red}{1}}\) The type of every expression in C is determined at compile-time except for variable-length arrays (since C99).
\({}^{\textcolor{red}{2}}\) A byte is 8 bits on most platforms, but we do have exceptions: 36-bit computing.
\({}^{\textcolor{red}{3}}\) There are several different signed number representations, but all popular machines and almost all compilers use two's complement. Before C23 and C++20, the C/C++ standards allowed all of these representations, so the minimal valid range for an \(n\)-bit integer was \(\left[-2^{n-1}+1,2^{n-1}-1\right]\), which is the range for one's complement and sign-and-magnitude. Since C23 and C++20, the only representation allowed is two's complement, so the valid range is guaranteed to be \(\left[-2^{n-1},2^{n-1}-1\right]\). In CS100 we assume that two's complement is used, even though we are based on C17/C++17.