Echo is a big problem in audio conferencing. You either need to listen on headphones or add an echo suppression algorithm (which can be tricky to write). Skype, for example, has built-in echo suppression for this very reason.
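To give a feel for what the simplest form of echo suppression looks like, here is a rough sketch that just turns the microphone down whenever the far-end (speaker) signal is loud. It is not how Skype or any particular library does it; the names `rms` and `suppress_echo`, the threshold, and the attenuation factor are all made up for illustration, and it assumes both signals arrive as frames of 16-bit PCM samples. A real implementation also has to cope with delay, double-talk, and smooth gain changes, which is where it gets tricky.

```python
import math
from typing import List, Sequence


def rms(frame: Sequence[int]) -> float:
    """Root-mean-square level of one frame of PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def suppress_echo(
    mic_frame: Sequence[int],
    speaker_frame: Sequence[int],
    threshold: float = 500.0,   # assumed level above which the far end counts as "talking"
    attenuation: float = 0.1,   # assumed gain applied to the mic while the far end talks
) -> List[int]:
    """Attenuate the mic frame while the speaker (far-end) frame is loud.

    This is a crude half-duplex gate, not an adaptive echo canceller.
    """
    if rms(speaker_frame) > threshold:
        return [int(s * attenuation) for s in mic_frame]
    return list(mic_frame)
```

You would call `suppress_echo` on each captured frame, passing the frame that was being played out of the speakers at about the same time, before handing the result to the encoder.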
To track down the background noise, try recording directly to a WAV file. That will tell you whether the noise is being introduced by the Speex or network streaming code, or is coming from the microphone itself.
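As a minimal sketch of that test, the following uses Python's standard `wave` module to dump raw samples straight to a WAV file, bypassing the codec and the network, and then measures the level of the result. The helper names `write_wav` and `wav_rms` and the 8000 Hz default are assumptions for illustration, and the code assumes mono 16-bit PCM; how you obtain the samples from the microphone depends on your capture API and is not shown.

```python
import math
import struct
import wave
from typing import Sequence


def write_wav(path: str, samples: Sequence[int], sample_rate: int = 8000) -> None:
    """Write mono 16-bit PCM samples directly to a WAV file."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)        # mono
        wf.setsampwidth(2)        # 2 bytes per sample = 16-bit
        wf.setframerate(sample_rate)
        wf.writeframes(struct.pack("<%dh" % len(samples), *samples))


def wav_rms(path: str) -> float:
    """Return the RMS level of a mono 16-bit WAV file (0.0 if it is empty)."""
    with wave.open(path, "rb") as wf:
        raw = wf.readframes(wf.getnframes())
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))
```

Record a few seconds of "silence" this way and listen to the file, or check `wav_rms` on it. If the noise is already there, the microphone or sound card is the culprit; if the file is clean, look at the encoding and streaming code instead.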