Class MSNumpress
java.lang.Object
com.compomics.util.experiment.io.mass_spectrometry.mzml.MSNumpress
public class MSNumpress extends Object
MSNumpress.
- Author:
- fgonzalez, jteleman, sperkins
-
Field Summary
Fields
Modifier and Type, Field, Description
static String ACC_NUMPRESS_LINEAR
static String ACC_NUMPRESS_PIC
static String ACC_NUMPRESS_SLOF
-
Constructor Summary
Constructors
Constructor, Description
MSNumpress()
-
Method Summary
Modifier and Type, Method, Description
static Double[] decode(String cvAccession, byte[] data)
Convenience function for decoding binary data encoded by MSNumpress.
static double decodeFixedPoint(byte[] data)
Decode fixed point.
static int decodeLinear(byte[] data, int dataSize, Double[] result)
Decodes data encoded by encodeLinear.
static int decodePic(byte[] data, int dataSize, Double[] result)
Decodes data encoded by encodePic. The result vector is guaranteed to be shorter than twice the data length (in number of values).
static int decodeSlof(byte[] data, int dataSize, Double[] result)
Decodes data encoded by encodeSlof.
static byte[] encode(double[] data, String cvAccession)
Encode.
static void encodeFixedPoint(double fixedPoint, byte[] result)
Encode fixed point.
static int encodeInt(long x, byte[] res, int resOffset)
This encoding works on a 4 byte integer, by truncating initial zeros or ones.
static int encodeLinear(double[] data, int dataSize, byte[] result, double fixedPoint)
Encodes the doubles in data by first using a lossy conversion to a 4 byte, 5 decimal fixed point representation, then storing the residuals from a linear prediction after the first two values, and finally encoding by encodeInt. The resulting binary is at most dataSize * 5 bytes, but much less if the data is reasonably smooth on the first order.
static int encodePic(double[] data, int dataSize, byte[] result)
Encodes ion counts by simply rounding to the nearest 4 byte integer, and compressing each integer with encodeInt.
static int encodeSlof(double[] data, int dataSize, byte[] result, double fixedPoint)
Encodes ion counts by taking the natural logarithm, and storing a fixed point representation of this.
static double optimalLinearFixedPoint(double[] data, int dataSize)
Optimal linear fixed point.
static double optimalSlofFixedPoint(double[] data, int dataSize)
Optimal slof fixed point.
-
Field Details
-
ACC_NUMPRESS_LINEAR
- See Also:
- Constant Field Values
-
ACC_NUMPRESS_PIC
- See Also:
- Constant Field Values
-
ACC_NUMPRESS_SLOF
- See Also:
- Constant Field Values
-
-
Constructor Details
-
MSNumpress
public MSNumpress()
-
-
Method Details
-
encode
Encode.
- Parameters:
data - the data
cvAccession - the accession number
- Returns:
- the encoded data
-
decode
Convenience function for decoding binary data encoded by MSNumpress. If the passed cvAccession is one of
ACC_NUMPRESS_LINEAR = "MS:1002312"
ACC_NUMPRESS_PIC = "MS:1002313"
ACC_NUMPRESS_SLOF = "MS:1002314"
the corresponding decode function will be called.
- Parameters:
cvAccession - the PSI-MS obo CV accession of the encoded data
data - array of bytes to be decoded
- Returns:
- the decoded doubles
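The accession-based dispatch described for decode can be pictured with the minimal sketch below. It is not the library's code: the class NumpressDispatchSketch and the helper decoderFor are hypothetical, and throwing an exception for an unknown accession is an assumption, since the documentation does not say what decode does in that case. Only the three accession strings are taken from the text above.

```java
final class NumpressDispatchSketch {
    // Accession values as listed in the decode description.
    static final String ACC_NUMPRESS_LINEAR = "MS:1002312";
    static final String ACC_NUMPRESS_PIC = "MS:1002313";
    static final String ACC_NUMPRESS_SLOF = "MS:1002314";

    /** Hypothetical helper: names the decode method matching an accession. */
    static String decoderFor(String cvAccession) {
        switch (cvAccession) {
            case ACC_NUMPRESS_LINEAR:
                return "decodeLinear";
            case ACC_NUMPRESS_PIC:
                return "decodePic";
            case ACC_NUMPRESS_SLOF:
                return "decodeSlof";
            default:
                // Assumption: the real method may handle this case differently.
                throw new IllegalArgumentException("Not a Numpress accession: " + cvAccession);
        }
    }
}
```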
-
encodeInt
public static int encodeInt(long x, byte[] res, int resOffset)
This encoding works on a 4 byte integer, by truncating initial zeros or ones. If the initial (most significant) halfbyte is 0x0 or 0xf, the number of such halfbytes starting from the most significant is stored in a halfbyte. This initial count is then followed by the rest of the int's halfbytes, in little-endian order. A count halfbyte c of
0 ≤ c ≤ 8 is interpreted as an initial c 0x0 halfbytes
9 ≤ c ≤ 15 is interpreted as an initial (c-8) 0xf halfbytes
Ex:
int    c      rest
0   => 0x8
-1  => 0xf    0xf
23  => 0x6    0x7 0x1
- Parameters:
x - the int to be encoded
res - the byte array where halfbytes are stored
resOffset - position in res where halfbytes are written
- Returns:
- the number of resulting halfbytes
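The halfbyte scheme can be traced with the minimal sketch below, which reproduces the three examples given for encodeInt. It is an independent illustration rather than the library's implementation: the class HalfByteSketch and the method encodeHalfBytes are hypothetical names, the argument is an int rather than a long, and storing one halfbyte per array element is an assumption about the layout of res. The cap of 7 dropped 0xf halfbytes follows from the count having to fit in 4 bits.

```java
final class HalfByteSketch {
    /**
     * Writes the count halfbyte and the remaining halfbytes of the 32-bit
     * value x into res, one halfbyte per array element, starting at
     * resOffset. Returns the number of halfbytes written.
     */
    static int encodeHalfBytes(int x, byte[] res, int resOffset) {
        int top = (x >>> 28) & 0xf;   // most significant halfbyte
        int lead = 0;                 // leading halfbytes that are dropped
        int count = 0;                // the count halfbyte c
        if (top == 0x0) {
            // Count leading 0x0 halfbytes, up to all 8.
            while (lead < 8 && ((x >>> (28 - 4 * lead)) & 0xf) == 0x0) {
                lead++;
            }
            count = lead;
        } else if (top == 0xf) {
            // Count leading 0xf halfbytes, capped at 7 so that c <= 15.
            while (lead < 7 && ((x >>> (28 - 4 * lead)) & 0xf) == 0xf) {
                lead++;
            }
            count = 8 + lead;
        }
        res[resOffset] = (byte) count;
        int kept = 8 - lead;
        for (int i = 0; i < kept; i++) {
            // Remaining halfbytes, least significant first (little-endian).
            res[resOffset + 1 + i] = (byte) ((x >>> (4 * i)) & 0xf);
        }
        return 1 + kept;
    }
}
```

A value whose top halfbyte is neither 0x0 nor 0xf gets a count of 0 followed by all 8 halfbytes, so 9 halfbytes in total.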
-
encodeFixedPoint
public static void encodeFixedPoint(double fixedPoint, byte[] result)
Encode fixed point.
- Parameters:
fixedPoint - the fixed point
result - the encoded result
-
decodeFixedPoint
public static double decodeFixedPoint(byte[] data)
Decode fixed point.
- Parameters:
data - the data
- Returns:
- the decoded data
-
optimalLinearFixedPoint
public static double optimalLinearFixedPoint(double[] data, int dataSize)
Optimal linear fixed point.
- Parameters:
data - the data
dataSize - the data size
- Returns:
- the optimal linear fixed point
-
encodeLinear
public static int encodeLinear(double[] data, int dataSize, byte[] result, double fixedPoint)
Encodes the doubles in data by:
- a lossy conversion to a 4 byte, 5 decimal fixed point representation
- storing the residuals from a linear prediction after the first two values
- encoding by encodeInt (see above)
The resulting binary is at most dataSize * 5 bytes, but much less if the data is reasonably smooth on the first order. This encoding is suitable for typical m/z or retention time binary arrays. For masses above 100 m/z the encoding is accurate to at least 0.1 ppm.
- Parameters:
data - array of doubles to be encoded
dataSize - number of doubles from data to encode
result - array where resulting bytes should be stored
fixedPoint - the scaling factor used for getting the fixed point representation. This is stored in the binary and automatically extracted on decoding.
- Returns:
- the number of encoded bytes
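To illustrate why smooth data compresses well under encodeLinear, the minimal sketch below computes fixed-point residuals from a linear prediction. It is an illustration under stated assumptions, not the library's code: the class LinearResidualSketch and the method residuals are hypothetical, the prediction is assumed to be a straight-line extrapolation from the two preceding values (2 * previous - one before that), and rounding to the nearest integer is assumed for the lossy step.

```java
final class LinearResidualSketch {
    /**
     * Converts each value to fixed point and returns, from the third value
     * on, the difference between the actual fixed-point value and a linear
     * extrapolation from the two preceding ones. The first two entries hold
     * the fixed-point values themselves.
     */
    static long[] residuals(double[] data, double fixedPoint) {
        long[] fixed = new long[data.length];
        long[] out = new long[data.length];
        for (int i = 0; i < data.length; i++) {
            fixed[i] = Math.round(data[i] * fixedPoint);  // lossy step
            if (i < 2) {
                out[i] = fixed[i];
            } else {
                // Assumed predictor: continue the line through the last two points.
                long predicted = 2 * fixed[i - 1] - fixed[i - 2];
                out[i] = fixed[i] - predicted;
            }
        }
        return out;
    }
}
```

For evenly spaced input such as 100.0, 100.5, 101.0, 101.5 with a scaling factor of 100000, every residual after the first two values is 0, which the halfbyte encoding stores in a single halfbyte.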
-
decodeLinear
Decodes data encoded by encodeLinear. Note that the compression discards any information < 1e-5, so data is only guaranteed to be within +- 5e-6 of the original value. Further, values > ~42000 will also be truncated because of the fixed point representation, so this scheme is strongly discouraged if values might exceed this size. The result vector is guaranteed to be shorter than twice the data length (in number of values). Returns the number of doubles read.
- Parameters:
data - array of bytes to be decoded
dataSize - number of bytes from data to decode
result - array where resulting doubles should be stored
- Returns:
- the number of decoded doubles, or -1 if dataSize < 4 or 4 < dataSize < 8
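The two limits quoted for the linear scheme can be checked with a little arithmetic, sketched below. The class FixedPointLimitsSketch and its methods are hypothetical. The sketch assumes a scaling factor of 1e5 (five decimals) and reads the ~42000 limit as the point where the scaled value no longer fits an unsigned 4 byte integer; both are interpretations of the text, not statements about the implementation.

```java
final class FixedPointLimitsSketch {
    static final double SCALE = 1e5;  // five decimals, assumed from the text

    /** Rounds a value to five decimals, standing in for the lossy step. */
    static double quantize(double value) {
        return Math.round(value * SCALE) / SCALE;
    }

    /** Largest value whose scaled form fits an unsigned 4 byte integer. */
    static double largestRepresentable() {
        return 4294967295L / SCALE;  // roughly 42949.67
    }
}
```

Rounding to the nearest multiple of 1e-5 moves a value by at most half a step, which is where the +- 5e-6 bound comes from.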
-
encodePic
public static int encodePic(double[] data, int dataSize, byte[] result)
Encodes ion counts by simply rounding to the nearest 4 byte integer, and compressing each integer with encodeInt. The representable range is therefore 0 to 4294967294. The resulting binary is at most dataSize * 5 bytes, but much less if the data is close to 0 on average.
- Parameters:
data - array of doubles to be encoded
dataSize - number of doubles from data to encode
result - array where resulting bytes should be stored
- Returns:
- the number of encoded bytes
-
decodePic
Decodes data encoded by encodePic. The result vector is guaranteed to be shorter than twice the data length (in number of values).
- Parameters:
data - array of bytes to be decoded (needs a memory-contiguous representation)
dataSize - number of bytes from data to decode
result - array where resulting doubles should be stored
- Returns:
- the number of decoded doubles
-
optimalSlofFixedPoint
public static double optimalSlofFixedPoint(double[] data, int dataSize)
Optimal slof fixed point.
- Parameters:
data - the data
dataSize - the data size
- Returns:
- the optimal slof fixed point
-
encodeSlof
public static int encodeSlof(double[] data, int dataSize, byte[] result, double fixedPoint)
Encodes ion counts by taking the natural logarithm, and storing a fixed point representation of this. This is calculated as
unsigned short fp = log(d) * fixedPoint + 0.5
- Parameters:
data - array of doubles to be encoded
dataSize - number of doubles from data to encode
result - array where resulting bytes should be stored
fixedPoint - the scaling factor used for getting the fixed point representation. This is stored in the binary and automatically extracted on decoding.
- Returns:
- the number of encoded bytes
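The formula quoted for encodeSlof can be tried out with the minimal sketch below. It follows the formula exactly as written in this documentation and assumes d >= 1 so that the logarithm is non-negative; an actual implementation may treat small or zero counts differently. The class SlofFormulaSketch, both method names, and the inverse function are hypothetical and only show that the transformation is approximately reversible.

```java
final class SlofFormulaSketch {
    /** Fixed-point log value, following the formula quoted in the text. */
    static int toFixedPoint(double d, double fixedPoint) {
        // Adding 0.5 before truncation rounds to the nearest integer.
        return (int) (Math.log(d) * fixedPoint + 0.5);
    }

    /** Hypothetical inverse: undoes the scaling, then the logarithm. */
    static double fromFixedPoint(int fp, double fixedPoint) {
        return Math.exp(fp / fixedPoint);
    }
}
```

With d = 1000 and a scaling factor of 1000, the stored value is 6908, and the inverse returns a count within 0.1 % of the original. For the result to fit an unsigned short it has to stay at or below 65535, which bounds the usable scaling factor for a given maximum count.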
-
decodeSlof
Decodes data encoded by encodeSlof. The result vector length is twice the data length. Returns the number of doubles read.
- Parameters:
data - array of bytes to be decoded (needs a memory-contiguous representation)
dataSize - number of bytes from data to decode
result - array where resulting doubles should be stored
- Returns:
- the number of decoded doubles
-