
Tagged "c"


Data Types and Variables in C

I’ve been writing a heap of Lua lately, and this has led to my becoming interested, again, in C. Here are some ancient notes I dug up on the very basics of data types and variables in C.

All of a computer’s memory is made up of bits. A sequence of 8 bits forms a byte. A group of bytes (typically 4 or 8) forms a word. Each word is associated with a memory address. The address increases by 1 with each byte of memory, so consecutive 4-byte words sit 4 addresses apart.

In C, a byte is an object that is as big as the smallest addressable unit.

On practically every modern platform that unit is 8 bits wide, although the C standard only guarantees at least 8 bits (see CHAR_BIT in limits.h).

A variable is a container for data. A variable is a symbolic representation of a memory location, or address.

A variable declaration is made up of a few parts: first the data type, then an identifier, and then, optionally, an initializer that gives the variable some data.

int number_of_bananas = 124;

Here, int is the data type, number_of_bananas is the identifier and 124 is the data.

C is statically typed: the data type of a variable cannot be changed after it is declared. You can also make the value immutable by turning it into a constant using the const keyword.

const int number_of_bananas = 124;

A data type is a collection of compile-time properties, including:

  • memory size and alignment
  • set of valid values
  • set of permitted operations

Some data types available in C include:

  • Numbers (int, float, double, etc.)
  • Characters
  • Strings (which are really arrays of characters)
  • Arrays
  • Complex data types, like structs and pointers

Numbers and characters are called “fundamental data types” in C; all other data types are called “derived data types” because they are derived from the fundamental types.

Integers, int, are any non-fractional numbers either negative or positive including 0. You would use an int to describe the number of pets you have — you cannot have a fractional number of pets…unless you’ve done something awful and/or are cosplaying as King Solomon.

ints come in both signed and unsigned flavors. An int can be negative or positive, while an unsigned int can be 0 or positive, never negative. Because it does not have to represent negative numbers, an unsigned int can hold positive values roughly twice as large as an int of the same size, which is useful when you need to express a very large positive value. So, if you were going to create a variable to represent the temperature in Fahrenheit, you would want to use an int, since the temperature in Fahrenheit can be negative, positive or exactly 0. If, on the other hand, you were going to create a variable to represent the temperature in Kelvin, you would probably want to use an unsigned int, since Kelvin starts at 0 and only goes up from there.

You can define the unsigned int data type using the keyword unsigned int or just unsigned.

Besides coming in signed and unsigned variants, int also comes in different sizes:

  • short int
  • int
  • long int
  • long long int

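These describe the different byte sizes allotted to the value. The exact sizes depend on the platform; C only guarantees minimums (at least 16 bits for short and int, 32 for long and 64 for long long). They all exist in unsigned variants, too. See stdint.h for waaaaay more on this, including fixed-width types like int32_t.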

Totally random aside! When displaying a variable with printf you need to use the correct format specifier: %d for a plain ol’ int, %ld for a long int. Now, if you wanna format the number a bit more, you can also include a minimum width to help pad the number, e.g. %7d will add 6 leading spaces before the number if it is 1 digit long, or 5 if the number is 2 digits long.

int the_number = 42;
printf("The Answer to life, the universe and everything is %7d\n", the_number);

// The Answer to life, the universe and everything is      42

floats and doubles can also include a number in their format string that defines their precision.

So, with 2 digits after the decimal point:

float pi = 3.14;
printf("%12.2f | PI\n", pi);
double pi2 = 314E-2;
printf("%12.2e | PI\n", pi2);

//      3.14 | PI
//  3.14e+00 | PI

Or with 4!

float pi = 3.14;
printf("%12.4f | PI\n", pi);
double pi2 = 314E-2;
printf("%12.4e | PI\n", pi2);

//    3.1400 | PI
//3.1400e+00 | PI

While ints represent discrete values, floating point numbers are used to represent (approximately) any number, negative or positive, including 0 and decimals, e.g. 3.14. C offers float for single precision and double for double precision. The same value can also be written in scientific notation, as 314E-2.

float pi = 3.14;
double pi2 = 314E-2;

char variables occupy exactly 1 byte and are represented numerically by a small integer, 8 bits wide on practically every platform. Whether a plain char is signed or unsigned is up to the implementation. A signed char typically ranges from -128 to 127. The ASCII table only uses 0 to 127, so it fits either way (Wikipedia has a good ASCII table).

While signed char ranges from -128 to 127, unsigned char ranges from 0 to 255.

A boolean is a variable that can only take 1 of 2 values: either true or false. C originally didn’t have any booleans; instead, 0 was treated as false and anything other than 0 was considered to be true. The boolean data type was introduced in the C99 standard (as _Bool, with bool, true and false available through stdbool.h), but treating 0 as false remains a common idiom.

An enumeration, or enum, is a list of named constants. It is useful for when you want to select exactly 1 option from a list of predefined values. Behind the scenes, enums are nothing more than numbers…this makes sense for a data type called an enumeration. Printing an enum gives you its number, not its identifier, e.g.

enum Menu {
    COFFEE,  // 0
    JUICE,   // 1
    WAFFLES, // 2
};

enum Menu order = JUICE;

printf("The order: %d\n", order);

// The order: 1

If you want to explicitly set the number of an enum option, you can. Note that the numbering of the options that follow picks up from whatever you defined.

enum months {
    JAN = 1, // the following months would continue with 2, 3, ...
};

The above ensures that the months are numbered in a sane way…not starting from 0.

In reply to: chibicc: A Small C Compiler

Each commit of this project corresponds to a section of the book. For this purpose, not only the final state of the project but each commit was carefully written with readability in mind. Readers should be able to learn how a C language feature can be implemented just by reading one or a few commits of this project.

In reply to: The Ferret Lisp System | Irreal

The source of the whole system is a single Org mode file. If you had any doubts as to whether Org could support a literate programming approach in a non-trivial project