Skip to content

Conversation

@ozraru
Copy link
Contributor

@ozraru ozraru commented Jan 18, 2025

I've rewrited entire code related to Voice.

This is very big pull request.
I'm sorry, I understand that it is difficult to review.
But I believe that this is valuable to check.

I'm not good at English. Wording fixes are welcome.
I'm not sure about the naming. Naming changes are also welcome.

I want to support E2EE in the future but it's not included in this pull request.


Summary

  • Use context to manage goroutines and connections

  • Eliminate busy wait using sync.Cond

  • Use Voice Gateway Version 8

  • Change to new transport encryption mode

    • aead_aes256_gcm_rtpsize and aead_xchacha20_poly1305_rtpsize
  • Resume Voice Conenction

  • Delete reconnection (excludes resume)

    • It is very rare and it should be handled by an application
  • Deliver unrecoverable error to the application

  • Edit some examples to work with these changes

Breaking changes

  • misc
    • Some functions require context.Context in argument
      • It can be used to set timeout
  • Session
    • Renamed ChannelVoiceJoinManual to VoiceStateUpdate
    • Remove ShouldReconnectVoiceOnSessionError
      • Because reconnection are deleted
  • VoiceConenction
    • Removed Ready
      • Use Status instead
    • Removed Debug
      • already deprecated
    • Removed ChangeChannel
      • Use Session.VoiceStateUpdate
    • Removed Close
      • Kill is alternative, but why not use Disconnect?
  • Packet
    • Removed Type
      • Separated to Flags and PayloadType

note: Currently using transport encryption mode(xsalsa20_poly1305) and voice gateway version(connecting without a version) is already discontinued according to the document. (but it works currently 🤔 )

Versions below 4 as well as the default version behaviour will be discontinued as of November 18th, 2024. Connections without a version or with a version less than version 4 will be rejected as of this date.
https://discord.com/developers/docs/topics/voice-connections#voice-gateway-versioning

The deprecated encryption modes above will be discontinued as of November 18th, 2024. As of this date the voice gateway will not allow you to connect with one of the deprecated encryption modes.
https://discord.com/developers/docs/topics/voice-connections#transport-encryption-and-sending-voice

ozraru and others added 5 commits January 19, 2025 00:19
## Summary
- Use context to manage goroutines and connections
- Eliminate busy wait using sync.Cond
- Use Voice Gateway Version 8
- Change to new encryption mode
- Resume Voice Conenction
- Delete reconnect (excludes resume)
  - It is very rare and it should be handled by an application
- Deliver unrecoverable error to the application

- Edit some examples to work with these changes

## Breaking changes
- misc
  - Some functions require context.Context in argument
- Session
  - Renamed ChannelVoiceJoinManual to VoiceStateUpdate
  - Remove ShouldReconnectVoiceOnSessionError
    - Because reconnecting feature are deleted
- VoiceConenction
  - Removed Ready
    - Use Status instead
  - Removed Debug
    - already deprecated
  - Removed ChangeChannel
    - Use Session.VoiceStateUpdate
  - Removed Close
    - Kill is alternative, but why not use Disconnect?
- Packet
  - Removed Type
    - Separated to Flags and PayloadType
@RadCraftplay
Copy link

RadCraftplay commented Sep 12, 2025

Thank you! Discord starts phasing out old encryption and some A/B testing shenanigans are going on right now. As a result certain voice channels do not work with master version of discordgo. This patch has fixed voice for me. I hope others in need notice. Also other problems like random reconnects are gone.

if !v.deaf {
if v.OpusRecv == nil {
v.OpusRecv = make(chan *Packet, 2)
case 2: // READY
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since you are doing overhaul let's fix few extra things and define these values in case statments as consts

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be used only here, so why comment is not enough?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm referring to all cases in this switch statement. For me, it's a preference and will help avoid confusion in the future. case 5: does not have a comment, for example. Additionally, people will change code but will forget to change comments which may cause confusion, if we have consts in place of 2, 3, 4, 5 code will be more self documenting and comments like // READY won't be needed.

Of course these are the rules i like to follow, i'm just suggesting, not requiring.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, it's better to declare these values as constants.

if v.sessionID != "" {
break
}
go v.websocket(context.TODO(), *ev.Endpoint, ev.Token)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the plan with this context.TODO()?
VoiceConnection.websocket is used only here so maybe remove context.Context argument and initialize it inside VoiceConnection.websocket function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Normally, top-level context should be generated in the top-level function (= main() of the application), but discordgo(.Session) doesn't accept it. So it is 'TODO'.
  2. Functions related to Session (ex: Session.Open, Session.listen) may also have context to manage the lifecycle, like this patch.
    It may deprecate Session.Close() and replace with cancel of context, and passing that context to here would resolve TODO: Add support for Voice WS/UDP (in the comment of Session.Close)
    It is out of the scope of this patch, but websocket() supports being canceled from the caller.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, sounds good

voice.go Outdated
go v.reconnect()
// 4014 indicates a manual disconnection by someone in the guild;
// we shouldn't reconnect.
if websocket.IsCloseError(err, 4014) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we could name 4014 and 4015 too?
const manualDisconnect = 4014 and ourCodeIsBad = 4015

@fr-str
Copy link

fr-str commented Sep 17, 2025

This fixed the issues for me, good job.
Few comments with minor tweaks and suggestions

ozraru and others added 3 commits September 18, 2025 04:37
Co-authored-by: fr-str <73256873+fr-str@users.noreply.github.com>
Co-authored-by: fr-str <73256873+fr-str@users.noreply.github.com>
Co-authored-by: fr-str <73256873+fr-str@users.noreply.github.com>
Copy link
Contributor Author

@ozraru ozraru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review!
I wrote response or merged each comment or suggestion.

if v.sessionID != "" {
break
}
go v.websocket(context.TODO(), *ev.Endpoint, ev.Token)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Normally, top-level context should be generated in the top-level function (= main() of the application), but discordgo(.Session) doesn't accept it. So it is 'TODO'.
  2. Functions related to Session (ex: Session.Open, Session.listen) may also have context to manage the lifecycle, like this patch.
    It may deprecate Session.Close() and replace with cancel of context, and passing that context to here would resolve TODO: Add support for Voice WS/UDP (in the comment of Session.Close)
    It is out of the scope of this patch, but websocket() supports being canceled from the caller.

if !v.deaf {
if v.OpusRecv == nil {
v.OpusRecv = make(chan *Packet, 2)
case 2: // READY
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be used only here, so why comment is not enough?

MohmmedAshraf added a commit to MohmmedAshraf/discordgo that referenced this pull request Oct 5, 2025
This branch contains the voice overhaul changes from PR bwmarrin#1593 that fix:
- Discord voice connection 'Unknown encryption mode' error (close code 4016)
- Modern encryption mode negotiation (aead_aes256_gcm_rtpsize, aead_xchacha20_poly1305_rtpsize)
- Improved voice connection stability
- Resolves Speaker 2 connection failures in ShotCaller bot

Based on: bwmarrin#1593
Applied for: ShotCaller Discord bot voice relay system
MohmmedAshraf added a commit to MohmmedAshraf/discordgo that referenced this pull request Oct 5, 2025
This branch contains the voice overhaul changes from PR bwmarrin#1593 that fix:
- Discord voice connection 'Unknown encryption mode' error (close code 4016)
- Modern encryption mode negotiation (aead_aes256_gcm_rtpsize, aead_xchacha20_poly1305_rtpsize)
- Improved voice connection stability
- Resolves Speaker 2 connection failures in ShotCaller bot

Based on: bwmarrin#1593
Applied for: ShotCaller Discord bot voice relay system
v.Cond.Broadcast()
close(v.dead)
go func() {
time.Sleep(100 * time.Millisecond) // safe
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a hardcoded value?


return
}
for i := 0; i < 100; i++ {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The number of reconnection attempts should be configurable.

Comment on lines +217 to +218
v.Kill()
v.Disconnect(context.Background())
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this safe?

Comment on lines -405 to +502
// Start reconnect goroutine then exit.
go v.reconnect()
// reconnect
break
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old code used to have

	<-time.After(wait * time.Second)
	wait *= 2

Should a wait be added here?

Comment on lines +175 to +177
time.Sleep(100 * time.Millisecond) // safe
close(v.OpusRecv)
close(v.OpusSend)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a potential race condition? Are you sure it's always safe?

@sheo9000
Copy link

Could you make a proper example of voice receiver? I can't get it to work, and examples/voice_receive/main.go is not compileable

p.PayloadType undefined (type *discordgo.Packet has no field or method PayloadType)

@galberti
Copy link
Contributor

Hi,
is this going to be merged anytime soon?

@galberti
Copy link
Contributor

galberti commented Nov 1, 2025

@ozraru Your branch works well, but after I while I get:

voice.go:207:failure() voice unrecoverable error, voice websocket closed, websocket: close 4022: Disconnected: Call terminated.

and the audio connections is not recovered.

@RadCraftplay
Copy link

RadCraftplay commented Nov 1, 2025

voice.go:207:failure() voice unrecoverable error, voice websocket closed, websocket: close 4022: Disconnected: Call terminated.

and the audio connections is not recovered.

@galberti This has been fixed in #1629. These changes probably would have to be applied similarly in this branch. Vendoring discordgo and merging PRs with your fork manually is probably the best idea

@galberti
Copy link
Contributor

galberti commented Nov 1, 2025

@RadCraftplay thank you for your answer. It is indeed fixed, but merging the 2 branches together isn't the easiest of the jobs :)

@ozraru
Copy link
Contributor Author

ozraru commented Nov 1, 2025

@sheo9000

Could you make a proper example of voice receiver? I can't get it to work, and examples/voice_receive/main.go is not compileable

p.PayloadType undefined (type *discordgo.Packet has no field or method PayloadType)

It's because it's using the release version of discordgo.
Adding replace github.com/bwmarrin/discordgo => ../../ to the end of examples/voice_receive/go.mod should resolve this.

However, it's still not working in my environment and I haven't been able to find the cause. I confirmed it was working when I wrote this PR, so the cause could be a change by Discord.

@ozraru
Copy link
Contributor Author

ozraru commented Nov 1, 2025

I have (partially) merged #1629. The close code problem should be fixed.

@ozraru
Copy link
Contributor Author

ozraru commented Nov 1, 2025

From a quick look, Hedede's review looks accurate and I think that I should resolve that. However, I currently don't have enough time to review it in detail. I'll get to it when I have time.

@Fl0GUI
Copy link

Fl0GUI commented Nov 12, 2025

Adding this diff to the voice_receive example doesn't print out anything (though I doubt it did anything without this patch).

diff --git a/examples/voice_receive/main.go b/examples/voice_receive/main.go
index fbea9f3..b12317e 100644
--- a/examples/voice_receive/main.go
+++ b/examples/voice_receive/main.go
@@ -90,6 +90,10 @@ func main() {
    return
  }

+ v.AddHandler(func (vc *discordgo.VoiceConnection, vs *discordgo.VoiceSpeakingUpdate) {
+   fmt.Printf("%+v\n", vs)
+ })
+
  go func() {
    time.Sleep(10 * time.Second)
    v.Disconnect(context.Background())

When I debug this, op 5 is hit once and early returns on the lack of handlers.

I'd really like to have this working, so I'll keep debugging to try and find the cause.

@Fl0GUI
Copy link

Fl0GUI commented Nov 16, 2025

I'd really like to have this working, so I'll keep debugging to try and find the cause.

I compared packet captures between this pr and discord.js when connecting to a channel. (I can't share the pcaps for security/privacy reasons, unless I spend a lot of effort anonymising it)

Some noteworthy differences/messages:

  • when sending op: 0 identify discord.js uses max_dave_protocol_version: 1 and discordgo uses 0.
  • discordgo sends an op: 3 heartbeat with a seq_ack: -1 right after op: 0 identify while discord.js doesn't. It doesn't seem to break anything.
  • After op: 0 identify the same packages are received by both implementation: operations 2, 11, 5, 18, and 20. 18 and 20 are undocumented.
  • when sending op: 1 select protocol the discordgo address has trailing null bytes in the json string value (ie. "10.10.10.10\u0000\u0000...").
  • after the last op: 4 session description and op: 15 (undocumented), discord.js receives encrypted messages while discordgo receives nothing.

Long story short, it seems like receiving op 5: speaking is only over the dave protocol. And since discordgo doesn't implement the dave protocol, no speaking updates are received 😔
But also, the happy flow channel connection of this pr is almost identical to discord.js.

I might give the dave protocol whitepaper a read, and maybe give an attempt at an implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants