Methods, apparatus, and systems for voice processing are provided herein. An exemplary method can be implemented by a terminal. A voice bit stream to be sent can be obtained. Voice control information corresponding to the voice bit stream to be sent can be obtained. The voice control information can be used for a voice server to determine a voice-mixing strategy. The voice bit stream and the voice control information can be sent to the voice server. At least one voice bit stream,returned by the voice server based on the voice-mixing strategy,can be received. The at least one voice bit stream can be outputted.