File Size Calculations

Digital data consists of binary information and is stored as a collection of 0s and 1s. On a computer system, numbers, text, pictures, sound files, video clips and computer programs are all stored using binary code.

Storing text files in binary

Text files are stored using a character set such as ASCII code or UNICODE. The number of bits used to encode one character has an impact on the total number of characters included in the character set.
For instance:

  • ASCII code uses 7 bits per character and contains 128 codes/characters.
  • Extended ASCII code uses 8 bits per character and contains 256 codes/characters.
  • UNICODE can be encoded using 2 Bytes per character for most characters (UTF-16) or 4 Bytes per character (UTF-32). 16 bits give 65,536 possible codes and 32 bits give 4,294,967,296 possible codes, enough to include the characters and symbols used in every language worldwide.
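The link between the number of bits per character and the size of the character set can be checked with a short sketch. The function name `number_of_codes` is only an illustrative choice and is not part of the task below.

```python
# Each extra bit doubles the number of distinct codes that can be represented.
def number_of_codes(bits_per_character):
    return 2 ** bits_per_character

for bits in (7, 8, 16, 32):
    print(bits, "bits ->", number_of_codes(bits), "codes")
```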

Based on this information, we can easily work out the formula used to estimate the size of a text file as follows:

Text File Size = number of bits per character x number of characters
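As a quick worked example of this formula, the sketch below assumes a file of 3,000 characters stored in Extended ASCII (8 bits per character). The function name `text_file_size_in_bits` is hypothetical and returns a raw number of bits without choosing a display unit.

```python
# Text File Size = number of bits per character x number of characters
def text_file_size_in_bits(bits_per_character, number_of_characters):
    return bits_per_character * number_of_characters

size_in_bits = text_file_size_in_bits(8, 3000)
print(size_in_bits, "bits")        # 24000 bits
print(size_in_bits // 8, "Bytes")  # 3000 Bytes (8 bits per Byte)
```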


Storing bitmap pictures in binary

A bitmap picture is a 2D grid of pixels of different colours. You can read more about how bitmap pictures are stored in binary on this post.

Two criteria will impact the file size of a bitmap picture:

  • The resolution: The number of pixels the picture contains, which can be defined as: width in pixels x height in pixels. For instance a picture of 640 by 480 pixels would contain 640 x 480 = 307,200 pixels.
  • The colour depth: The number of bits used to encode the colour of one pixel. For instance a 1-bit colour depth means that the graphic can only include 2 colours (e.g. 1 = black, 0 = white), an 8-bit colour depth means that the graphic can include up to 256 colours, and a 3-Byte (24-bit) colour depth (RGB code) would include up to 16,777,216 colours.

Based on this information, we can easily work out the formula used to estimate the size of a bitmap picture as follows:

Picture File Size = colour depth x width in pixels x height in pixels
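To illustrate the formula, here is a minimal sketch for a 640 by 480 picture with an 8-bit colour depth. The function name `picture_file_size_in_bits` is an assumption made for this example, and the meta data of the file is not counted.

```python
# Picture File Size = colour depth x width in pixels x height in pixels
def picture_file_size_in_bits(width, height, colour_depth):
    return colour_depth * width * height

size_in_bits = picture_file_size_in_bits(640, 480, 8)
print(size_in_bits, "bits")        # 2457600 bits
print(size_in_bits // 8, "Bytes")  # 307200 Bytes
```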


Note that a bitmap picture would also include a few more Bytes of data to store the Meta Data which contain additional information used by the computer to render the graphic such as the width of the graphic in pixels, its height in pixels and its colour depth. We will however ignore this in our file size estimation as for large graphics this would only make a small difference to the file size estimation.

Storing sound files in binary

An analogue sound wave can be digitalised using a process called sound sampling. You can find out more about sound sampling on this post.

Three criteria will impact the file size of a sound file:

  • The sample rate: The sample rate corresponds to the number of samples being recorded per second. For instance a phone call would have a sample rate of 8kHz (8,000 samples per second) whereas an audio CD would record music with a sample rate of 44.1kHz (44,100 samples per second) resulting in a higher quality sound.
  • The bit depth: The bit depth corresponds to the number of bits used to record one sample. For instance retro-arcade games used to use 8-bit music. Old mobile phones used to use 16-bit ringtones. Higher quality sound files may use a 32-bit bit depth or higher.
  • The duration: The duration of the sound file in seconds determines the number of samples needed to record the sound and hence has an impact on the file size.

Based on this information, we can easily work out the formula used to estimate the size of a sound file as follows:

Sound File Size = sample rate x duration x bit depth


Note that the above formula is used to estimate the file size of a mono sound file. Some sound files use multiple channels, such as stereo files (2 channels) or Dolby-surround sound files (6 channels). To estimate their file size, you need to multiply the above formula by the number of channels.

Sound File Size = sample rate x duration x bit depth x number of channels
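The sketch below applies this formula to an uncompressed stereo track sampled at 44.1kHz with a 16-bit bit depth and lasting 210 seconds. The function name `sound_file_size_in_bits` and the default of one channel (mono) are assumptions made for the example.

```python
# Sound File Size = sample rate x duration x bit depth x number of channels
def sound_file_size_in_bits(sample_rate, duration, bit_depth, channels=1):
    return sample_rate * duration * bit_depth * channels

size_in_bits = sound_file_size_in_bits(44100, 210, 16, channels=2)
print(size_in_bits, "bits")        # 296352000 bits
print(size_in_bits // 8, "Bytes")  # 37044000 Bytes
```

Leaving out the `channels` argument gives the mono estimate from the first formula.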


As with picture files, a sound file would also include some meta-data (sample rate, bit depth, number of channels) needed for the computer to interpret the data. We will once again ignore this data in our file size estimation.

Programming Task

Your task is to write three procedures used to estimate the file size of text files, bitmap pictures and sound files as follows:

  • estimateTextFileSize() will take two parameters, the number of bits per character and the number of characters in the file. It will output the estimated file size using the formula provided earlier in this post.
  • estimatePictureFileSize() will take three parameters, the width and height of the picture in pixels and its colour depth. It will output the estimated file size using the formula provided earlier in this post.
  • estimateSoundFileSize() will take four parameters, the sample rate (in Hz), the bit depth, the duration (in seconds) and the number of channels. It will output the estimated file size using the formula provided earlier in this post.

Note that for all three procedures, the output information should be displayed using the most suitable unit (bits, Bytes, KB, MB or GB).
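One possible way to pick the most suitable unit is sketched below. The helper name `format_size`, its `base` parameter (1000 or 1024) and the choice of two decimal places are all assumptions made for illustration, not requirements of the task.

```python
def format_size(size_in_bits, base=1000):
    """Return a size given in bits as text using bits, Bytes, KB, MB or GB."""
    # Anything smaller than one Byte is reported in bits.
    if size_in_bits < 8:
        return f"{size_in_bits} bits"
    size = size_in_bits / 8  # convert bits to Bytes
    units = ["Bytes", "KB", "MB", "GB"]
    index = 0
    # Move up one unit at a time while the value is still too large.
    while size >= base and index < len(units) - 1:
        size /= base
        index += 1
    return f"{size:.2f} {units[index]}"

print(format_size(24000))        # 3.00 KB
print(format_size(24000, 1024))  # 2.93 KB
```

The loop stops at GB, so larger files are simply reported as a large number of GB.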

Python Code

Complete your code below:

Test Plan


All done? It’s now time to test your code to see if it works as expected.

| Test # | Type of file | Input Values | Expected Output | Actual Output |
| --- | --- | --- | --- | --- |
| #1 | Text File | Number of bits per character: 8 bits (Extended ASCII); Number of characters: 3,000 | File Size: 3KB (or 2.93KB) | |
| #2 | Text File | Number of bits per character: 16 bits (Unicode UTF-16); Number of characters: 12,000 | File Size: 24KB (or 23.44KB) | |
| #3 | Picture File | Width: 640 pixels; Height: 480 pixels; Colour depth: 8 bits | File Size: 307.2KB (or 300KB) | |
| #4 | Picture File | Width: 1920 pixels; Height: 1080 pixels; Colour depth: 24 bits | File Size: 6.22MB (or 5.93MB) | |
| #5 | Sound File (mobile phone ring tone) | Sample Rate: 8 kHz (= 8,000 Hz); Bit Depth: 16 bits per sample; Duration: 30 seconds; Channels: 1 (mono) | File Size: 480KB (or 468.75KB) | |
| #6 | Sound File (uncompressed audio CD track) | Sample Rate: 44.1 kHz (= 44,100 Hz); Bit Depth: 16 bits per sample; Duration: 210 seconds; Channels: 2 (stereo) | File Size: 37.04MB (or 35.33MB) | |

Note that this test plan gives you two possible outputs for each test depending on whether your calculations are based on 1KB = 1,000 Bytes or 1KB = 1,024 Bytes. Both approaches are acceptable.
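To see where the two values come from, the short sketch below works through the input values of Test #2 (16 bits per character, 12,000 characters) with both conventions. The variable names are illustrative only.

```python
size_in_bits = 16 * 12000         # bits per character x number of characters
size_in_bytes = size_in_bits // 8

print(size_in_bytes / 1000, "KB")            # 24.0 KB, using 1KB = 1,000 Bytes
print(round(size_in_bytes / 1024, 2), "KB")  # 23.44 KB, using 1KB = 1,024 Bytes
```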

Extension Task 1: Animated Gif File

Animated Gif files consist of a collection of bitmap pictures that are displayed one at a time over a few seconds. Most animated gif files loop back to the first picture (frame) after reaching the last frame. The frame rate of a gif file defines the number of frames per second.

We can calculate the size of an animated gif file as follows:

Animated Gif File Size = width x height x colour depth x frame rate x duration
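A minimal sketch of this calculation is shown below for a 150 by 150 animation with a 4-bit colour depth, 4 frames per second and a duration of 6 seconds. The function name `animated_gif_size_in_bits` is hypothetical and the result is a raw number of bits.

```python
def animated_gif_size_in_bits(width, height, colour_depth, frame_rate, duration):
    bits_per_frame = width * height * colour_depth
    number_of_frames = frame_rate * duration
    return bits_per_frame * number_of_frames

size_in_bits = animated_gif_size_in_bits(150, 150, 4, 4, 6)
print(size_in_bits, "bits")        # 2160000 bits
print(size_in_bits // 8, "Bytes")  # 270000 Bytes
```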


Your task is to create an extra function called estimateAnimatedGifFileSize() that will take five parameters: the width and height of the pictures in pixels, their colour depth, the frame rate in fps (frames per second) and the duration of the animation in seconds. It will output the estimated file size using the above formula.

You can then test your subroutine using the following input data:

| Test # | Type of file | Input Values | Expected Output | Actual Output |
| --- | --- | --- | --- | --- |
| #1 | Animated Gif File | Width: 150 pixels; Height: 150 pixels; Colour depth: 4 bits; Frame Rate: 4 fps; Duration: 6 seconds | File Size: 270KB (or 263.67KB) | |

Extension Task 2: Movie Files

Movie files are similar to animated gif files. A movie clip also consists of a collection of still pictures displayed with a high frame rate, e.g. 24 fps (frames per second). Movie clips also include a soundtrack that needs to be included in the estimation of the overall file size of a movie clip.

Your task is to write a subroutine that estimates the file size of an uncompressed movie clip by adding the estimated size of its frames to the estimated size of its soundtrack. You can then test your subroutine using the following input data:

| Test # | Type of file | Input Values | Expected Output | Actual Output |
| --- | --- | --- | --- | --- |
| #1 | Uncompressed Movie File | Width: 1920 pixels; Height: 1080 pixels; Colour depth: 24 bits; Frame Rate: 24 fps; Duration: 1 hour 15 minutes; Soundtrack: Sample Rate: 48 kHz; Bit Depth: 16 bits per sample; Duration: 1 hour 15 minutes; Channels: 6 (Dolby-Surround) | File Size: 674.44GB (or 628.12GB) | |
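The estimate for a movie clip can be sketched as the sum of a picture part and a sound part, each following the formulas used earlier in this post. The function name `movie_file_size_in_bits` and its parameter order are assumptions made for this example.

```python
def movie_file_size_in_bits(width, height, colour_depth, frame_rate,
                            sample_rate, bit_depth, channels, duration):
    # Picture part: one bitmap per frame for the whole duration.
    video_bits = width * height * colour_depth * frame_rate * duration
    # Sound part: one sample per channel, sample_rate times per second.
    audio_bits = sample_rate * bit_depth * channels * duration
    return video_bits + audio_bits

duration = 75 * 60  # 1 hour 15 minutes expressed in seconds
total_bits = movie_file_size_in_bits(1920, 1080, 24, 24, 48000, 16, 6, duration)
total_bytes = total_bits // 8

print(round(total_bytes / 1000 ** 3, 2), "GB")  # using 1GB = 1,000^3 Bytes
print(round(total_bytes / 1024 ** 3, 2), "GB")  # using 1GB = 1,024^3 Bytes
```

With these input values the pictures account for almost all of the total. The soundtrack adds about 2.6GB.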

Compression Algorithms

Note that these calculations are based on estimating file size of uncompressed files. Compression algorithms are often applied to picture files, sound files and movie files to reduce their overall file size.

For instance .png or .jpg picture files, .mp3 sound files or .mp4 movie files are all compressed files so their file size would be smaller than the file size given by the above calculations.
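The effect of compression can be observed with Python's built-in zlib module, which implements the lossless DEFLATE method that .png files also rely on. The sketch below is only an illustration on made-up pixel data (10,000 identical red pixels). Formats such as .jpg, .mp3 and .mp4 use different, lossy techniques that are not shown here.

```python
import zlib

# 10,000 red pixels stored as 24-bit RGB values (3 Bytes per pixel).
raw = bytes([255, 0, 0] * 10000)
compressed = zlib.compress(raw)

print(len(raw), "Bytes uncompressed")
print(len(compressed), "Bytes after compression")
```

Highly repetitive data like this compresses very well. A real photograph would usually shrink far less with a lossless method.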
