java.lang.Object
com.compomics.util.experiment.io.mass_spectrometry.mzml.MSNumpress

public class MSNumpress
extends Object
MSNumpress.
Author:
fgonzalez, jteleman, sperkins
  • Field Summary

    Fields 
    Modifier and Type Field Description
    static String ACC_NUMPRESS_LINEAR  
    static String ACC_NUMPRESS_PIC  
    static String ACC_NUMPRESS_SLOF  
  • Constructor Summary

    Constructors 
    Constructor Description
    MSNumpress()  
  • Method Summary

    Modifier and Type Method Description
    static Double[] decode​(String cvAccession, byte[] data)
    Convenience function for decoding binary data encoded by MSNumpress.
    static double decodeFixedPoint​(byte[] data)
    Decode fixed point.
    static int decodeLinear​(byte[] data, int dataSize, Double[] result)
    Decodes data encoded by encodeLinear.
    static int decodePic​(byte[] data, int dataSize, Double[] result)
    Decodes data encoded by encodePic. The result vector is guaranteed to be shorter than twice the data length (in number of values).
    static int decodeSlof​(byte[] data, int dataSize, Double[] result)
    Decodes data encoded by encodeSlof.
    static byte[] encode​(double[] data, String cvAccession)
    Encode.
    static void encodeFixedPoint​(double fixedPoint, byte[] result)
    Encode fixed point.
    static int encodeInt​(long x, byte[] res, int resOffset)
    This encoding works on a 4 byte integer, by truncating initial zeros or ones.
    static int encodeLinear​(double[] data, int dataSize, byte[] result, double fixedPoint)
    Encodes the doubles in data by first using a lossy conversion to a 4 byte, 5 decimal fixed point representation, then storing the residuals from a linear prediction after the first two values, and finally encoding these with encodeInt (see above). The resulting binary is maximally dataSize * 5 bytes, but much less if the data is reasonably smooth on the first order.
    static int encodePic​(double[] data, int dataSize, byte[] result)
    Encodes ion counts by simply rounding to the nearest 4 byte integer, and compressing each integer with encodeInt.
    static int encodeSlof​(double[] data, int dataSize, byte[] result, double fixedPoint)
    Encodes ion counts by taking the natural logarithm, and storing a fixed point representation of this.
    static double optimalLinearFixedPoint​(double[] data, int dataSize)
    Optimal linear fixed point.
    static double optimalSlofFixedPoint​(double[] data, int dataSize)
    Optimal slof fixed point.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

  • Constructor Details

  • Method Details

    • encode

      public static byte[] encode​(double[] data, String cvAccession)
      Encode.
      Parameters:
      data - the data
      cvAccession - the accession number
      Returns:
      the encoded data
    • decode

      public static Double[] decode​(String cvAccession, byte[] data)
      Convenience function for decoding binary data encoded by MSNumpress. If the passed cvAccession is one of
      ACC_NUMPRESS_LINEAR = "MS:1002312"
      ACC_NUMPRESS_PIC = "MS:1002313"
      ACC_NUMPRESS_SLOF = "MS:1002314"
      the corresponding decode function will be called.
      Parameters:
      cvAccession - The PSI-MS obo CV accession of the encoded data.
      data - array of bytes to be decoded
      Returns:
      The decoded doubles
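
      The dispatch on cvAccession described above can be pictured with the following minimal sketch. It is not the library source: the class NumpressDispatchSketch, the Scheme enum and the schemeFor method are hypothetical names, and throwing on an unknown accession is an assumption, since this page does not say what decode does in that case.

```java
/** Illustrative only; mirrors the accession values listed above. */
class NumpressDispatchSketch {

    static final String ACC_NUMPRESS_LINEAR = "MS:1002312";
    static final String ACC_NUMPRESS_PIC = "MS:1002313";
    static final String ACC_NUMPRESS_SLOF = "MS:1002314";

    /** Hypothetical label for the three compression schemes. */
    enum Scheme { LINEAR, PIC, SLOF }

    /** Maps a PSI-MS accession to the scheme whose decoder would be called. */
    static Scheme schemeFor(String cvAccession) {
        switch (cvAccession) {
            case ACC_NUMPRESS_LINEAR:
                return Scheme.LINEAR;
            case ACC_NUMPRESS_PIC:
                return Scheme.PIC;
            case ACC_NUMPRESS_SLOF:
                return Scheme.SLOF;
            default:
                // Assumption: the documented behaviour for other accessions is unknown.
                throw new IllegalArgumentException("Not an MSNumpress accession: " + cvAccession);
        }
    }

    public static void main(String[] args) {
        System.out.println(schemeFor("MS:1002313")); // prints "PIC"
    }
}
```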
    • encodeInt

      public static int encodeInt​(long x, byte[] res, int resOffset)
      This encoding works on a 4 byte integer, by truncating initial zeros or ones. If the initial (most significant) halfbyte is 0x0 or 0xf, the number of such halfbytes starting from the most significant is stored in a halfbyte. This initial count is then followed by the rest of the int's halfbytes, in little-endian order. A count halfbyte c of
      0 ≤ c ≤ 8 is interpreted as an initial c 0x0 halfbytes
      9 ≤ c ≤ 15 is interpreted as an initial (c-8) 0xf halfbytes
      Ex:
      int     c    rest
      0   =>  0x8
      -1  =>  0xf  0xf
      23  =>  0x6  0x7 0x1
      Parameters:
      x - the int to be encoded
      res - the byte array where halfbytes are stored
      resOffset - position in res where halfbytes are written
      Returns:
      the number of resulting halfbytes
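
      The halfbyte scheme described above can be sketched as follows. This is an illustrative re-implementation written from the description, not the library source. The class name HalfByteSketch is hypothetical, and storing one halfbyte per array element is an assumption made for readability.

```java
/** Illustrative re-implementation of the halfbyte scheme; not the library source. */
class HalfByteSketch {

    /**
     * Writes the halfbyte encoding of the low 32 bits of x into res,
     * one halfbyte per array element, starting at resOffset.
     * Returns the number of halfbytes written (count halfbyte included).
     */
    static int encodeInt(long x, byte[] res, int resOffset) {
        long v = x & 0xFFFFFFFFL;      // treat x as a 4 byte pattern
        long top = (v >>> 28) & 0xF;   // most significant halfbyte
        int run = 0;                   // leading halfbytes equal to 0x0 or 0xf
        if (top == 0x0 || top == 0xF) {
            while (run < 8 && ((v >>> (28 - 4 * run)) & 0xF) == top) {
                run++;
            }
            // A run of 0xf is capped at 7 so the count halfbyte stays within 9..15.
            if (top == 0xF && run > 7) {
                run = 7;
            }
        }
        int count = (top == 0xF) ? run + 8 : run;
        res[resOffset] = (byte) count;
        int rest = 8 - run;            // halfbytes that still carry information
        for (int i = 0; i < rest; i++) {
            // little-endian order: least significant halfbyte first
            res[resOffset + 1 + i] = (byte) ((v >>> (4 * i)) & 0xF);
        }
        return 1 + rest;
    }

    public static void main(String[] args) {
        byte[] buf = new byte[9];      // at most 1 count halfbyte + 8 data halfbytes
        int n = encodeInt(23, buf, 0);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(String.format("0x%x ", buf[i]));
        }
        System.out.println(sb.toString().trim()); // prints "0x6 0x7 0x1"
    }
}
```

      For the three examples given above, this sketch yields 0x8 for 0, 0xf 0xf for -1, and 0x6 0x7 0x1 for 23.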
    • encodeFixedPoint

      public static void encodeFixedPoint​(double fixedPoint, byte[] result)
      Encode fixed point.
      Parameters:
      fixedPoint - the fixed point
      result - the encoded result
    • decodeFixedPoint

      public static double decodeFixedPoint​(byte[] data)
      Decode fixed point.
      Parameters:
      data - the data
      Returns:
      the decoded data
    • optimalLinearFixedPoint

      public static double optimalLinearFixedPoint​(double[] data, int dataSize)
      Optimal linear fixed point.
      Parameters:
      data - the data
      dataSize - the data size
      Returns:
      the optimal linear fixed point
    • encodeLinear

      public static int encodeLinear​(double[] data, int dataSize, byte[] result, double fixedPoint)
      Encodes the doubles in data by first using
      - a lossy conversion to a 4 byte, 5 decimal fixed point representation
      - storing the residuals from a linear prediction after the first two values
      - encoding by encodeInt (see above)
      The resulting binary is maximally dataSize * 5 bytes, but much less if the data is reasonably smooth on the first order. This encoding is suitable for typical m/z or retention time binary arrays. For masses above 100 m/z the encoding is accurate to at least 0.1 ppm.
      Parameters:
      data - array of double to be encoded
      dataSize - number of doubles from data to encode
      result - array where the resulting bytes should be stored
      fixedPoint - the scaling factor used for getting the fixed point representation. This is stored in the binary and automatically extracted on decoding.
      Returns:
      the number of encoded bytes
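
      The first two steps of the scheme (fixed point conversion and linear prediction) can be sketched as below. This is a simplified illustration, not the library source: the class LinearPredictionSketch and the method residuals are hypothetical, the integer encoding step is left out, and the exact rounding used by the library is an assumption.

```java
/** Illustrative sketch of fixed point conversion followed by linear prediction. */
class LinearPredictionSketch {

    /**
     * Returns the integers that would be handed to the integer encoder:
     * the first two fixed point values as they are, then, for every later value,
     * the difference between the value and its linear extrapolation.
     */
    static long[] residuals(double[] data, double fixedPoint) {
        long[] out = new long[data.length];
        long prev2 = 0;
        long prev1 = 0;
        for (int i = 0; i < data.length; i++) {
            long fp = Math.round(data[i] * fixedPoint);   // lossy fixed point conversion
            if (i < 2) {
                out[i] = fp;                              // no prediction possible yet
            } else {
                long predicted = prev1 + (prev1 - prev2); // straight line through the last two points
                out[i] = fp - predicted;
            }
            prev2 = prev1;
            prev1 = fp;
        }
        return out;
    }

    public static void main(String[] args) {
        // Evenly spaced values are predicted exactly, so the residuals are 0.
        long[] r = residuals(new double[] {100.0, 100.5, 101.0, 101.5}, 100000.0);
        for (long value : r) {
            System.out.println(value);
        }
    }
}
```

      Small residuals have many leading 0x0 or 0xf halfbytes, which is why smooth data compresses to far fewer bytes than the stated maximum.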
    • decodeLinear

      public static int decodeLinear​(byte[] data, int dataSize, Double[] result)
      Decodes data encoded by encodeLinear. Note that the compression discards any information < 1e-5, so data is only guaranteed to be within +- 5e-6 of the original value. Further, values > ~42000 will also be truncated because of the fixed point representation, so this scheme is strongly discouraged if values might exceed this size. The result vector is guaranteed to be shorter than twice the data length (in number of values). Returns the number of doubles read.
      Parameters:
      data - array of bytes to be decoded
      dataSize - number of bytes from data to decode
      result - array where the resulting doubles should be stored
      Returns:
      the number of decoded doubles, or -1 if dataSize < 4 or 4 < dataSize < 8
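
      The precision figures quoted above can be checked with a small calculation. The sketch below assumes a scaling factor of 100000 (five decimals) and assumes that the ~42000 limit corresponds to an unsigned 32-bit range; neither assumption is confirmed by this page, and the class FixedPointPrecisionSketch is hypothetical.

```java
/** Illustrative arithmetic for a five decimal fixed point representation. */
class FixedPointPrecisionSketch {

    /** Assumed scaling factor: five decimals. */
    static final double FIXED_POINT = 100000.0;

    /** Quantises a value to five decimals and converts it back. */
    static double roundTrip(double value) {
        return Math.round(value * FIXED_POINT) / FIXED_POINT;
    }

    /** Largest value that fits if the fixed point integer is an unsigned 32-bit number. */
    static double largestUnsigned32() {
        return 4294967295L / FIXED_POINT;
    }

    public static void main(String[] args) {
        double original = 445.123456789;
        double error = Math.abs(roundTrip(original) - original);
        System.out.println("round trip error: " + error);           // at most about 5e-6
        System.out.println("upper limit: " + largestUnsigned32());  // a little under 43000
    }
}
```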
    • encodePic

      public static int encodePic​(double[] data, int dataSize, byte[] result)
      Encodes ion counts by simply rounding to the nearest 4 byte integer, and compressing each integer with encodeInt. The handleable range is therefore 0 -> 4294967294. The resulting binary is maximally dataSize * 5 bytes, but much less if the data is close to 0 on average.
      Parameters:
      data - array of doubles to be encoded
      dataSize - number of doubles from data to encode
      result - array where the resulting bytes should be stored
      Returns:
      the number of encoded bytes
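
      The size bound stated above follows from the halfbyte encoding: each rounded count needs one count halfbyte plus at most eight data halfbytes, that is at most 4.5 bytes. The sketch below estimates the encoded size of an array of ion counts under that scheme. It is an illustration written from the descriptions on this page, not the library source, and the names PicSizeSketch, halfBytesFor and encodedBytes are hypothetical.

```java
/** Illustrative size estimate for rounded ion counts under the halfbyte scheme. */
class PicSizeSketch {

    /** Number of halfbytes (count halfbyte included) needed for the low 32 bits of x. */
    static int halfBytesFor(long x) {
        long v = x & 0xFFFFFFFFL;
        long top = (v >>> 28) & 0xF;
        int run = 0;                    // leading halfbytes equal to 0x0 or 0xf
        if (top == 0x0 || top == 0xF) {
            while (run < 8 && ((v >>> (28 - 4 * run)) & 0xF) == top) {
                run++;
            }
            if (top == 0xF && run > 7) {
                run = 7;                // count halfbyte for 0xf runs is limited to 9..15
            }
        }
        return 1 + (8 - run);
    }

    /** Rounds each count to the nearest integer and sums the halfbytes, two per byte. */
    static int encodedBytes(double[] counts) {
        int halfBytes = 0;
        for (double c : counts) {
            halfBytes += halfBytesFor(Math.round(c));
        }
        return (halfBytes + 1) / 2;     // an odd trailing halfbyte still occupies a byte
    }

    public static void main(String[] args) {
        System.out.println(encodedBytes(new double[] {0.2, 3.6, 15.0, 0.0})); // prints "3"
    }
}
```

      Counts close to 0 need only one or two halfbytes each, whereas a value with no leading 0x0 or 0xf halfbyte needs all nine.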
    • decodePic

      public static int decodePic​(byte[] data, int dataSize, Double[] result)
      Decodes data encoded by encodePic. The result vector is guaranteed to be shorter than twice the data length (in number of values).
      Parameters:
      data - array of bytes to be decoded (needs a memory-contiguous representation)
      dataSize - number of bytes from data to decode
      result - array where the resulting doubles should be stored
      Returns:
      the number of decoded doubles
    • optimalSlofFixedPoint

      public static double optimalSlofFixedPoint​(double[] data, int dataSize)
      Optimal slof fixed point.
      Parameters:
      data - the data
      dataSize - the data size
      Returns:
      the optimal slof fixed point
    • encodeSlof

      public static int encodeSlof​(double[] data, int dataSize, byte[] result, double fixedPoint)
      Encodes ion counts by taking the natural logarithm, and storing a fixed point representation of this. This is calculated as
      unsigned short fp = log(d) * fixedPoint + 0.5
      Parameters:
      data - array of doubles to be encoded
      dataSize - number of doubles from data to encode
      result - array where the resulting bytes should be stored
      fixedPoint - the scaling factor used for getting the fixed point representation. This is stored in the binary and automatically extracted on decoding.
      Returns:
      the number of encoded bytes
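
      The formula above can be tried out with the following minimal sketch. It applies the formula exactly as stated on this page. The decoding half is an assumed inverse (exponential of the stored value divided by fixedPoint), and the class SlofSketch and its method names are hypothetical. Inputs are assumed to be at least 1 so that the logarithm is not negative.

```java
/** Illustrative sketch of the short logged float formula; not the library source. */
class SlofSketch {

    /** fp = log(d) * fixedPoint + 0.5, truncated to an integer (rounds to nearest). */
    static int toFixedPoint(double d, double fixedPoint) {
        return (int) (Math.log(d) * fixedPoint + 0.5);
    }

    /** Assumed inverse of toFixedPoint. */
    static double fromFixedPoint(int fp, double fixedPoint) {
        return Math.exp(fp / fixedPoint);
    }

    public static void main(String[] args) {
        double fixedPoint = 1000.0;                 // arbitrary scaling factor for the demo
        int fp = toFixedPoint(1000.0, fixedPoint);
        System.out.println(fp);                     // prints "6908"
        System.out.println(fromFixedPoint(fp, fixedPoint));
    }
}
```

      Because the rounding error in the logarithm is at most 0.5 / fixedPoint, the relative error of the recovered value is roughly bounded by the same amount, so a larger fixedPoint gives a more accurate result as long as fp still fits in an unsigned short.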
    • decodeSlof

      public static int decodeSlof​(byte[] data, int dataSize, Double[] result)
      Decodes data encoded by encodeSlof. The result vector length is twice the data length. Returns the number of doubles read.
      Parameters:
      data - array of bytes to be decoded (needs a memory-contiguous representation)
      dataSize - number of bytes from data to decode
      result - array where the resulting doubles should be stored
      Returns:
      the number of decoded doubles