What could be a good way to reduce the size of the data?
Here is a simple prediction scheme for vertex positions:

Assume you're starting with an initial triangle (v1,v2,v3).
Find a neighbouring triangle which shares an edge with the current triangle.
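Now take the midpoint of the shared edge and reflect the unshared vertex (in this case v1, assuming the shared edge is (v2,v3)) through it to the other side. This gives a predicted position for the new vertex v4 of the neighbouring triangle: v4' = v2 + v3 - v1, i.e. the point that completes the parallelogram.
Instead of storing v4 you just store the delta from the predicted position v4' to the actual vertex.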
As these deltas will usually cover a much smaller range than absolute positions (at least on reasonably smooth meshes), you can quantize them to just a few bits.
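
A rough sketch of the idea in Python, assuming the shared edge is (v2,v3) and a uniform quantization step. The function names and the step size are made up for illustration, not taken from any particular library:

```python
def predict(v1, v2, v3):
    """Reflect v1 through the midpoint of the shared edge (v2, v3).

    2 * midpoint - v1 simplifies to v2 + v3 - v1.
    """
    return tuple(b + c - a for a, b, c in zip(v1, v2, v3))


def encode_delta(actual, predicted, step):
    """Quantize the residual to integer multiples of `step` (assumed uniform)."""
    return tuple(round((a - p) / step) for a, p in zip(actual, predicted))


def decode_vertex(predicted, quantized, step):
    """Rebuild the vertex from the prediction and the quantized residual."""
    return tuple(p + q * step for p, q in zip(predicted, quantized))


# Initial triangle and the actual fourth vertex (example values only).
v1, v2, v3 = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
v4 = (1.03125, 0.96875, 0.0625)
step = 1.0 / 64.0  # hypothetical quantization step

v4_pred = predict(v1, v2, v3)            # (1.0, 1.0, 0.0)
delta_q = encode_delta(v4, v4_pred, step)  # small integers, cheap to store
v4_dec = decode_vertex(v4_pred, delta_q, step)
```

One thing to watch: once the deltas are quantized the decoder only ever sees the decoded vertices, so the encoder should also predict from decoded positions rather than the originals, otherwise the error accumulates as you walk across the mesh.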