Channel: NAudio

New Post: Time lost at end of MP3 file during WAV -> MP3 conversion using Media Foundation

Matt,

I've been using NAudio to convert WAV files to MP3 with some success. However, I've come across a problem that I hope you can help me with. With some of my test audio files, when I use NAudio to convert to MP3, it seems not all of the compressed audio is written out. In my tests I lose about a second of audio according to my player, and I can indeed hear that the WAV is slightly longer. What I've tried so far:

I. Inside MediaFoundationEncoder.ConvertOneBuffer I detect when a "partial" buffer will be used to generate a sample and pad the remainder of the buffer with zeros. This seems to write out the entirety of the audio data plus a section of silence; I can confirm this with an audible test as well as by inspecting the binary data against the "truncated" MP3 (I've seen up to 892 extra non-zero bytes written out when I pad). However, I question whether all of the zero bytes are actually written out.

II. I've tried making the managedBuffer object a dynamic length matching the number of bytes actually read from the inputProvider. This produces the same output as your original code.

III. I've also tried adjusting BytesToNsPosition to give me the bytes for the next whole "frame" (because MP3's frame rate is roughly 38 frames per second, i.e. about 26 ms per frame: 26 * waveFormat.BitsPerSample / 8). This resulted in a buffer size of 52 for my data, which mostly just increased the conversion time and gave the same result as your base source code.
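For reference, here is the arithmetic behind the frame-alignment idea in item III, sketched in Python (NAudio itself is C#, so this is only an illustration of the math, not of its API; the 1152-samples-per-frame figure is the MPEG-1 Layer III value):

```python
# An MPEG-1 Layer III (MP3) frame always encodes 1152 PCM samples per channel.
SAMPLES_PER_FRAME = 1152

def pcm_bytes_per_mp3_frame(channels, bits_per_sample):
    """PCM input bytes consumed by one MP3 frame."""
    block_align = channels * bits_per_sample // 8
    return SAMPLES_PER_FRAME * block_align

def frame_duration_ms(sample_rate):
    """Duration of one MP3 frame in milliseconds."""
    return 1000.0 * SAMPLES_PER_FRAME / sample_rate

# For 44.1 kHz stereo 16-bit PCM: one frame spans ~26.12 ms (~38.28 frames/s)
# and consumes 1152 * 2 * 2 = 4608 input bytes -- far more than a 52-byte
# buffer, which is why aligning to 52 bytes cannot capture a whole frame.
```

This suggests the "26 ms per frame" figure got conflated with "bytes per frame": at 44.1 kHz the two differ by roughly two orders of magnitude.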

Questions:
  1. I noticed that for the same input WAV file, the IMFSinkWriter statistics show qwNumSamplesReceived greater than qwNumSamplesProcessed/qwNumSamplesEncoded. Is this expected behavior? My thought is that the IMFSinkWriter receives a WriteSample request but won't queue the bytes for writing until a "full" sample has been created, which would explain why padding the managedBuffer appears to write out the remaining non-zero data.
  2. I suspect that whatever problem I'm experiencing has to do with the remaining bytes read from the inputProvider (a WAVReader in this case) not being enough to create a "complete" sample, but I don't see a way to calculate what a "complete" sample would need to be. I know that some encoding algorithms require a minimum number of input bytes to work, but I had assumed this was handled when I call DoFinalize() on the IMFSinkWriter. Any clarity on the process Media Foundation goes through would be helpful.
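To make the "incomplete final sample" hypothesis in question 2 concrete, here is a small sketch (Python, purely illustrative; the 1152-sample frame size is the MPEG-1 Layer III value, and the zero-fill step mirrors the padding experiment from item I):

```python
SAMPLES_PER_FRAME = 1152  # PCM samples per MPEG-1 Layer III frame

def trailing_remainder(total_bytes, channels, bits_per_sample):
    """Bytes left over after the last whole encoder frame's worth of input."""
    frame_bytes = SAMPLES_PER_FRAME * channels * bits_per_sample // 8
    return total_bytes % frame_bytes

def zero_pad_to_frame(buffer, channels, bits_per_sample):
    """Pad a partial final buffer with zeros (silence) up to a whole frame."""
    frame_bytes = SAMPLES_PER_FRAME * channels * bits_per_sample // 8
    remainder = len(buffer) % frame_bytes
    if remainder == 0:
        return buffer
    return buffer + bytes(frame_bytes - remainder)

# Example: a 100000-byte 44.1 kHz stereo 16-bit stream ends with a partial
# frame of 100000 % 4608 = 3232 bytes; padding appends 1376 zero bytes so the
# encoder sees a complete final frame instead of discarding the tail.
```

If the encoder really does drop any tail shorter than one frame, the audio lost would be at most ~26 ms per file, so a full second of loss would point at something beyond simple frame truncation (e.g. how the player computes duration from the bitrate header).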
I'm going to post this message both at CodeProject and CodePlex.
