Tor's Programming Toolbox

1 Coding & Compression

1.1 Delta encoding

Delta encoding is a way of storing or transmitting data in the form of differences (deltas) between sequential data rather than complete files; more generally this is known as data differencing.

1.1.1 When to use

  • Small or constant variation in data

1.1.2 Simple implementation

#[derive(Debug)]
struct DeltaCoding<T> {
    data: Vec<T>,
    prev: T
}

impl DeltaCoding<i32> {
    fn new() -> DeltaCoding<i32> {
        DeltaCoding {
            data: Vec::new(),
            prev: 0
        }
    }

    fn encode(&mut self, new_data: i32) -> &mut DeltaCoding<i32> {
        // `prev` starts at 0, so the first delta is the first value itself.
        // wrapping_sub keeps extreme inputs from panicking on overflow;
        // decode undoes it with wrapping_add.
        let delta = new_data.wrapping_sub(self.prev);
        self.data.push(delta);
        self.prev = new_data;

        self
    }

    fn decode(&self) -> Vec<i32> {
        // Running sum of the deltas. An empty encoder decodes to an empty
        // Vec instead of panicking on data[0].
        let mut acc: i32 = 0;
        self.data
            .iter()
            .map(|&delta| {
                acc = acc.wrapping_add(delta);
                acc
            })
            .collect()
    }
}


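fn main() {
    let xs: Vec<i32> = vec![1, 2, 3, 4, 5, 6, 7];

    let mut delta: DeltaCoding<i32> = DeltaCoding::new();

    // Feed the values in one at a time; only the differences are stored.
    for x in &xs {
        delta.encode(*x);
    }
    println!("Input:   {:?}", &xs);            // [1, 2, 3, 4, 5, 6, 7]
    println!("Encoded: {:?}", &delta.data);    // [1, 1, 1, 1, 1, 1, 1]
    println!("Decoded: {:?}", delta.decode()); // [1, 2, 3, 4, 5, 6, 7]
}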

In this case we could compress further by run-length encoding the deltas: [(1, 7)] says that the delta 1 repeats 7 times (the first value is stored as a delta from 0, so it is part of the run). Furthermore, if we know that the variation is small, we can store the deltas in a narrower type such as u8, or i8 if they can be negative, instead of i32.
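
As a rough sketch of those two follow-up ideas, the functions below run-length encode a slice of deltas and try to narrow them to i8. The names run_length_encode, run_length_decode and narrow_to_i8 are made up for this illustration and are not part of the DeltaCoding struct above; picking i8 as the narrow type is likewise an assumption, made because deltas can be negative.

```rust
use std::convert::TryFrom;

/// Collapse consecutive equal deltas into (delta, run_length) pairs.
/// Hypothetical helper, not part of DeltaCoding.
fn run_length_encode(deltas: &[i32]) -> Vec<(i32, usize)> {
    let mut runs: Vec<(i32, usize)> = Vec::new();
    for &delta in deltas {
        // Extend the current run if the delta repeats.
        if let Some(last) = runs.last_mut() {
            if last.0 == delta {
                last.1 += 1;
                continue;
            }
        }
        // Otherwise start a new run of length 1.
        runs.push((delta, 1));
    }
    runs
}

/// Expand (delta, run_length) pairs back into the flat list of deltas.
fn run_length_decode(runs: &[(i32, usize)]) -> Vec<i32> {
    runs.iter()
        .flat_map(|&(delta, count)| std::iter::repeat(delta).take(count))
        .collect()
}

/// Try to store every delta in an i8. Returns None if any delta falls
/// outside -128..=127, in which case the wider type has to be kept.
fn narrow_to_i8(deltas: &[i32]) -> Option<Vec<i8>> {
    deltas.iter().map(|&d| i8::try_from(d).ok()).collect()
}
```

For the encoded output of the example, run_length_encode(&[1, 1, 1, 1, 1, 1, 1]) gives [(1, 7)], and run_length_decode turns that back into seven 1s. Note that narrow_to_i8 also has to fit the first element, which is the first input value itself, so in practice one might keep that value at full width and narrow only the rest.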

A lot of cool things to be done!