Analysis File Formats
Kyma can resynthesize sounds from several different types of analyses: SOS (Sum-of-Sines or additive synthesis), GA (group additive synthesis), and RE (resonator-excitation synthesis). The analysis information is stored in Apple Audio Interchange File Format (AIFF) files.
AIFF Header
SOS Analysis Files
GA Analysis Files
RE Analysis Files
The AIFF Header
A complete description of the Apple Audio Interchange File Format can be found in Inside Macintosh: Sound, published by Addison-Wesley. The information given here is the minimum necessary to produce analysis files for Kyma.
AIFF files on the Macintosh have a file type of 'AIFF'; on Windows, the file extension is '.aif'. The file itself is divided into chunks, each of which contains data about the entire file.
Every AIFF file consists of a single 'FORM' chunk. Inside that chunk are sub-chunks that contain the header information as well as the sample data. The minimal AIFF header consists of the main 'FORM' chunk, plus the 'COMM' chunk and 'SSND' chunk. Kyma defines three distinct application-specific chunks, one for each type of analysis, to encode additional information for the analysis file. After the initial 'FORM' chunk, the remaining chunks can be presented in any order. Note that all multi-byte entities (16- and 32-bit words) are stored big-endian (most significant byte first), independent of the endianness of the host platform.
FORM Chunk
- 00-03 'FORM'
Start of 'FORM' chunk.
- 04-07 ckSize
Size of 'FORM' chunk (== file_size_in_bytes - 8)
- 08-11 'AIFF'
Indicates type of 'FORM' (in this case audio data)
COMMON Chunk
- 12-15 'COMM'
Start of 'COMM' chunk.
- 16-19 ckSize
Size of 'COMM' chunk in bytes (== 18)
- 20-21 channels
Indicates number of audio channels
- 22-25 frames
Indicates number of samples in file
- 26-27 sampleBits
Indicates number of bits per sample (8, 16 or 24)
- 28-37 SR
Sample rate as SANE 80-bit extended float (use 0x400EAC44000000000000 for 44100 Hz)
SOUND DATA Chunk
- 38-41 'SSND'
Start of 'SSND' chunk.
- 42-45 ckSize
Size of 'SSND' chunk (== channels * frames * sampleBits / 8 + 8)
- 46-49 offset
Offset to first byte of sample data that follows (== 0)
- 50-53 blockSize
Block size of data that follows (== 0)
- 54-+ samples
Sample data
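The header layout above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names extended_80 and build_aiff and the extra_chunks parameter are hypothetical, and the pad byte added after odd-length sample data follows the general AIFF chunk rule rather than anything stated in this section.

```python
import math
import struct


def extended_80(value: float) -> bytes:
    """Encode a positive number as a big-endian 80-bit extended float."""
    if value <= 0:
        raise ValueError("value must be positive")
    # value == mantissa * 2**exponent, with 0.5 <= mantissa < 1
    mantissa, exponent = math.frexp(value)
    biased = exponent - 1 + 16383            # format stores 1.xxx with bias 16383
    significand = int(mantissa * (1 << 64))  # explicit integer bit ends up in bit 63
    return struct.pack(">HQ", biased, significand)


def build_aiff(samples: bytes, channels: int, frames: int,
               sample_bits: int, sample_rate: float,
               extra_chunks: bytes = b"") -> bytes:
    """Return a complete AIFF file image: FORM, COMM, SSND, then extra chunks."""
    comm = struct.pack(">4sIhIh", b"COMM", 18, channels, frames, sample_bits)
    comm += extended_80(sample_rate)
    # ckSize counts offset (4) + blockSize (4) + sample bytes
    ssnd = struct.pack(">4sIII", b"SSND", len(samples) + 8, 0, 0) + samples
    if len(samples) % 2:
        ssnd += b"\x00"  # pad byte, not counted in ckSize (general AIFF rule)
    body = b"AIFF" + comm + ssnd + extra_chunks
    # FORM ckSize is the file size minus the 8 bytes of 'FORM' and ckSize
    return struct.pack(">4sI", b"FORM", len(body)) + body
```

Because the chunks are written in the order FORM, COMM, SSND, the byte offsets of the result match the offsets listed above; any application-specific chunk passed in extra_chunks is appended after the sample data.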
SOS Analysis Files
SOS analysis files contain the amplitude and frequency envelopes of the individual sine waves used in the Sum-of-Sines synthesis. SOS is the format developed in conjunction with Lippold Haken of the CERL Sound Group.
The SOS analysis file is organized by frames. A frame contains the values of the amplitude and frequency envelopes at a specific point in time. The envelope data are arranged in increasing partial order (which usually corresponds to increasing frequency).
The frames are stored in the 'SSND' chunk of the AIFF file; the AIFF file must be one channel (monophonic) and 24 bits per sample. An application-specific chunk encodes the number of partials per frame and the duration of each frame.
Within each frame, the amplitude and frequency values for each partial are combined to form a single 24-bit number.
The top signed byte encodes the log of the amplitude value as:
log2 ( amp ) * 127 / 15 + 127
giving a range of approximately 90 dB in increments of 0.711 dB. If the amplitude envelope is zero, the value of the byte should be zero. Only positive values are permitted, so be sure to clip the value to the interval of 0 to 127.
The bottom unsigned word encodes the log of the frequency value as:
log2 ( 2 * freq / SR ) * 65536 / 15 + 65536
giving a range up to the Nyquist limit in increments of 0.275 cents. Note that SR is the same value as stored in the 'COMM' chunk.
For example, to encode a frequency of 1000 Hz at an amplitude of 0.75, you would obtain an amplitude encoding of 123 (0x7B) and a frequency encoding of 46038 (0xB3D6), assuming a sample rate of 44100 Hz. These two values are combined to give the total encoding of 0x7BB3D6.
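The encoding described above can be sketched in Python as follows. The function names are hypothetical. Truncating toward the lower integer is an assumption that agrees with the worked example; clipping the frequency word to 65535 and writing zero for a non-positive frequency are also assumptions, since the text does not cover those cases.

```python
import math


def encode_sos_partial(amp: float, freq: float, sample_rate: float) -> int:
    """Combine one partial's amplitude and frequency into a 24-bit value."""
    if amp <= 0.0:
        amp_byte = 0  # a zero amplitude envelope is stored as zero
    else:
        amp_byte = math.floor(math.log2(amp) * 127 / 15 + 127)
        amp_byte = max(0, min(127, amp_byte))  # only 0..127 is permitted
    if freq <= 0.0:
        freq_word = 0  # assumption: the text does not define this case
    else:
        freq_word = math.floor(
            math.log2(2 * freq / sample_rate) * 65536 / 15 + 65536)
        # assumption: clip so the value fits in an unsigned 16-bit word
        freq_word = max(0, min(65535, freq_word))
    return (amp_byte << 16) | freq_word


def encode_sos_frame(partials, sample_rate: float) -> bytes:
    """Encode (amp, freq) pairs, given in increasing partial order,
    as consecutive big-endian 24-bit samples."""
    return b"".join(
        encode_sos_partial(amp, freq, sample_rate).to_bytes(3, "big")
        for amp, freq in partials)
```

With the values from the example above, encode_sos_partial(0.75, 1000.0, 44100.0) returns 0x7BB3D6.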
SOS APPL Chunk
- 00-03 'APPL'
Start of 'APPL' chunk.
- 04-07 ckSize
Size of 'APPL' chunk (== numberPartials * 4 + 16)
- 08-11 'SOSe'
Indicates SOS envelopes information.
- 12-15 ignored
Ignored. Write as zero.
- 16-19 numberPartials
Number of partials per frame.
- 20-? reserved
Reserved. Write one 32-bit word of zero per partial.
- ?-?? frameDuration
Duration of each frame, in microseconds (32-bit word).
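As a sketch of the SOS 'APPL' chunk layout above, the following Python function builds the chunk as bytes. The function name build_sos_appl_chunk is hypothetical, and treating the 32-bit fields as unsigned is an assumption; for the small positive values involved, signed and unsigned encodings are identical.

```python
import struct


def build_sos_appl_chunk(number_partials: int, frame_duration_us: int) -> bytes:
    """Build the 'APPL' chunk carrying SOS envelope information."""
    ck_size = number_partials * 4 + 16
    chunk = struct.pack(">4sI4sII",
                        b"APPL", ck_size,
                        b"SOSe",
                        0,                 # ignored field, written as zero
                        number_partials)
    chunk += bytes(4 * number_partials)    # one reserved zero word per partial
    chunk += struct.pack(">I", frame_duration_us)  # microseconds per frame
    return chunk
```

The returned length is ckSize plus the 8 bytes taken by 'APPL' and ckSize themselves.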
GA Analysis Files
GA analysis files contain the complex waveforms, their corresponding amplitude envelopes, and an overall frequency deviation envelope.
The data are stored in the 'SSND' chunk of the AIFF file; the AIFF file must be one channel (monophonic). An application-specific chunk encodes the number of complex waveforms.
The GA analysis file contains, in order: one period of each complex waveform (4096 sample points each), followed by each amplitude envelope, followed by the overall frequency deviation envelope. All envelopes must be the same length and fewer than 2048 sample points long; they are typically sampled at 100 Hz. The frequency deviation envelope encodes the (relative) frequency envelope:
freq / baseFreq - 1
giving a range between 0 Hz and one octave above the base frequency of the analysis.
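The ordering and length rules for GA data can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes exactly one amplitude envelope per complex waveform. It works with plain numeric values and deliberately leaves out the conversion to sample words, which this section does not specify.

```python
WAVEFORM_LENGTH = 4096       # one period of each complex waveform
MAX_ENVELOPE_LENGTH = 2048   # envelopes must be shorter than this


def frequency_deviation(freqs, base_freq):
    """Relative frequency envelope: freq / baseFreq - 1 for each point."""
    return [f / base_freq - 1.0 for f in freqs]


def assemble_ga_data(waveforms, amp_envelopes, freq_deviation):
    """Concatenate GA data: waveforms, amplitude envelopes, frequency deviation."""
    if len(amp_envelopes) != len(waveforms):
        raise ValueError("expected one amplitude envelope per waveform")
    for waveform in waveforms:
        if len(waveform) != WAVEFORM_LENGTH:
            raise ValueError("each waveform must have 4096 sample points")
    envelope_length = len(freq_deviation)
    if envelope_length >= MAX_ENVELOPE_LENGTH:
        raise ValueError("envelopes must be fewer than 2048 points long")
    for envelope in amp_envelopes:
        if len(envelope) != envelope_length:
            raise ValueError("all envelopes must be the same length")
    data = []
    for waveform in waveforms:
        data.extend(waveform)
    for envelope in amp_envelopes:
        data.extend(envelope)
    data.extend(freq_deviation)
    return data
```

A frequency of 0 maps to -1, the base frequency maps to 0, and one octave above the base frequency maps to 1.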
GA APPL Chunk
- 00-03 'APPL'
Start of 'APPL' chunk.
- 04-07 ckSize
Size of 'APPL' chunk (== 8)
- 08-11 'gaga'
Indicates GA analysis information.
- 12-15 waveformCount
Number of complex waveforms in the analysis file.
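The GA 'APPL' chunk is small enough to build with a single struct call. The sketch below uses a hypothetical function name and assumes waveformCount is an unsigned 32-bit word.

```python
import struct


def build_ga_appl_chunk(waveform_count: int) -> bytes:
    """Build the 'APPL' chunk carrying GA analysis information."""
    # ckSize is 8: the 'gaga' signature (4 bytes) plus waveformCount (4 bytes)
    return struct.pack(">4sI4sI", b"APPL", 8, b"gaga", waveform_count)
```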
RE Analysis Files
RE analysis files contain the coefficients for a time-varying resonant filter. The coefficients are stored in the 'SSND' chunk of the AIFF file; the AIFF file must be one channel (monophonic). We recommend using 24-bit samples to maintain as much accuracy as possible in the filter coefficients. An application-specific chunk encodes the number of coefficients and other information.
The RE analysis file is organized by frames. A frame contains the scaled coefficients of the time-varying resonant filter at one point in time. A frame must contain a power-of-two number of coefficients; within the frame the coefficients are in increasing delay order, with the zero-delay coefficient omitted.
The coefficients are stored as fixed point numbers. To encode the coefficients, first determine the smallest power of two larger than the maximum of the absolute values of all of the coefficients over all frames. Then each coefficient is encoded as:
coef / twoPower
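The scaling step can be sketched in Python as follows. The function names are hypothetical. The conversion to a 24-bit sample assumes the usual signed fractional convention, in which a value in the range -1 to 1 is multiplied by 2^23; this section does not spell out that convention. The sketch also does not restrict the shift to non-negative values, which is likewise unspecified here.

```python
import math


def scale_re_coefficients(frames):
    """Return (shift, scaled_frames), where twoPower == 2 ** shift is the
    smallest power of two larger than every absolute coefficient value."""
    max_abs = max(abs(c) for frame in frames for c in frame)
    if max_abs == 0.0:
        shift = 0
    else:
        # frexp gives max_abs == m * 2**e with 0.5 <= m < 1, so 2**e > max_abs
        _, shift = math.frexp(max_abs)
    two_power = 2.0 ** shift
    scaled = [[c / two_power for c in frame] for frame in frames]
    return shift, scaled


def to_fixed_24(value: float) -> bytes:
    """Assumed fixed-point encoding: signed 24-bit fraction, big-endian."""
    word = round(value * (1 << 23))
    word = max(-(1 << 23), min((1 << 23) - 1, word))
    return word.to_bytes(3, "big", signed=True)
```

The shift returned by scale_re_coefficients is the value log2 ( twoPower ), and every scaled coefficient has an absolute value below 1.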
RE APPL Chunk
- 00-03 'APPL'
Start of 'APPL' chunk.
- 04-07 ckSize
Size of 'APPL' chunk (== 16)
- 08-11 'LiPC'
Indicates RE analysis information.
- 12-15 frames
Number of sets of resonant filter coefficients.
- 16-19 numberCoefficients
Number of resonant filter coefficients per frame.
- 20-23 shift
Coefficient scale (== log2 ( twoPower )).
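A sketch of the RE 'APPL' chunk in Python follows. The function name is hypothetical. Packing the shift as a signed 32-bit word and the other fields as unsigned words is an assumption; the two encodings agree for non-negative values.

```python
import struct


def build_re_appl_chunk(frames: int, number_coefficients: int, shift: int) -> bytes:
    """Build the 'APPL' chunk carrying RE analysis information."""
    # A frame must contain a power-of-two number of coefficients.
    if number_coefficients <= 0 or number_coefficients & (number_coefficients - 1):
        raise ValueError("number_coefficients must be a power of two")
    # ckSize is 16: 'LiPC' plus three 32-bit fields
    return struct.pack(">4sI4sIIi",
                       b"APPL", 16,
                       b"LiPC",
                       frames,
                       number_coefficients,
                       shift)              # log2 of twoPower
```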

