Fix: Model downloads failing on VPN connections (SSL/TLS decrypt error). QoL: Retry button #1574

Open
NeuralFault wants to merge 8 commits into LykosAI:main from NeuralFault:vpn-ssl-downloadfix

NeuralFault (Contributor) commented Mar 11, 2026

This pull request introduces user-facing support for retrying failed model downloads, including UI updates, business logic, and robust handling of transient network errors. The main changes add a "Retry" button to the download manager, implement the underlying retry logic with exponential backoff, and ensure proper state management for retries.

User-facing and ViewModel changes:

  • Added a Retry button to the download manager UI, visible only when a download has failed and retry is supported (ProgressManagerPage.axaml).
    (Screenshot: retrybutton)

  • Extended PausableProgressItemViewModelBase to support retry: added the SupportsRetry and CanRetry properties, a RetryCommand, and a virtual Retry method, so subclasses can define retry logic and expose it to the UI.

  • Enabled retry support in DownloadProgressItemViewModel and implemented the Retry method, which resets the attempt counter and re-registers the download with the download service.
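
The retry surface described in the bullets above can be sketched roughly as follows. This is an illustrative Python sketch, not the PR's C# code: the class names, the `ProgressState` enum, and the `restart` callback are hypothetical stand-ins for PausableProgressItemViewModelBase, DownloadProgressItemViewModel, and TryRestartDownload.

```python
from enum import Enum, auto


class ProgressState(Enum):
    INACTIVE = auto()
    WORKING = auto()
    FAILED = auto()


class PausableProgressItem:
    """Hypothetical base item: retry is opt-in for subclasses."""

    def __init__(self):
        self.state = ProgressState.INACTIVE

    @property
    def supports_retry(self) -> bool:
        # Defaults to False so existing item types are unaffected.
        return False

    @property
    def can_retry(self) -> bool:
        # The Retry button is only meaningful for a failed, retry-capable item.
        return self.supports_retry and self.state is ProgressState.FAILED

    def retry(self) -> None:
        raise NotImplementedError("This item does not support retry")


class DownloadProgressItem(PausableProgressItem):
    """Hypothetical download item that opts in to retry."""

    def __init__(self, restart):
        super().__init__()
        self.attempts = 0
        self._restart = restart  # stand-in for the service's restart call

    @property
    def supports_retry(self) -> bool:
        return True

    def retry(self) -> None:
        if not self.can_retry:
            return
        # Reset the counter and state first, then hand the item back to the service.
        self.attempts = 0
        self.state = ProgressState.INACTIVE
        self._restart(self)
```

Keeping `can_retry` as a derived property means the UI only has to bind button visibility to one value, and the base class never needs to know which subclasses opted in.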

Core download logic improvements:

  • Added detection of transient network/SSL exceptions (IOException and AuthenticationException, including inner and aggregate exceptions) as retryable, and implemented exponential backoff with jitter for automatic retries, persisting state before each delay to ensure resumability.
  • Added ResetAttempts method to TrackedDownload to allow manual retry to reset the attempt counter and state cleanly.
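
As a rough illustration of the transient-error check, the sketch below walks an exception's nested causes and any grouped exceptions. It is written in Python rather than the PR's C#, so the types are assumed analogues only: `ConnectionError`, `TimeoutError`, and `ssl.SSLError` stand in for IOException and AuthenticationException, and the function name is hypothetical.

```python
import ssl

# Assumed Python analogues of the exception types treated as retryable.
TRANSIENT_TYPES = (ConnectionError, TimeoutError, ssl.SSLError)


def is_transient_network_error(exc, _seen=None):
    """Return True if exc, or anything nested inside it, looks transient."""
    seen = set() if _seen is None else _seen
    if exc is None or id(exc) in seen:
        return False  # nothing to inspect, or already visited (cycle guard)
    seen.add(id(exc))

    if isinstance(exc, TRANSIENT_TYPES):
        return True

    # Grouped exceptions expose their members via an `exceptions` attribute;
    # wrapped exceptions are reachable through __cause__ / __context__.
    nested = list(getattr(exc, "exceptions", ()))
    nested.append(exc.__cause__)
    nested.append(exc.__context__)
    return any(is_transient_network_error(inner, seen) for inner in nested)
```

Checking nested exceptions matters because HTTP client layers often wrap the underlying socket or TLS failure in a more generic error, so a check on the outermost type alone would miss it.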

Service layer changes:

  • Added TryRestartDownload to ITrackedDownloadService and its implementation, allowing failed downloads to be re-added to the tracking dictionary and resumed as new retry attempts.

When a user is on a VPN connection, the provider can occasionally reroute the tunnel on its end, which breaks the TCP download stream for a short period (known cases: NordVPN and Proton VPN).
When this happens, the current logic immediately tries to continue the download before the connection can recover, and commonly fails all 3 retries. This leaves the download in a failed state, and the user has to go back to the Model Browser and restart the download manually.
This change keeps the 3 retries but allows time for the connection to fully reset so the download can resume properly. In the extreme case that all 3 retries still fail, I have added a Retry button for QoL, so the user no longer has to search for the model again to restart the download.
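
To illustrate the timing idea, here is a small Python sketch of exponential backoff with jitter around a retry loop. It is not the PR's C# code: the base delay, cap, and jitter fraction are assumed values, and `download_once` / `save_state` are hypothetical callbacks.

```python
import random
import time


def backoff_delay(attempt, base=2.0, cap=30.0, jitter=0.25, rng=random.random):
    """Delay in seconds before retry number `attempt` (1-based)."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    delay = min(cap, base * 2 ** (attempt - 1))  # 2s, 4s, 8s, ... up to cap
    return delay * (1.0 + jitter * rng())        # add up to 25% random jitter


def download_with_retries(download_once, save_state, max_retries=3, sleep=time.sleep):
    """Run download_once, retrying up to max_retries times on connection errors."""
    for retry in range(max_retries + 1):
        try:
            return download_once()
        except ConnectionError:
            if retry == max_retries:
                raise  # out of retries: surface the failure to the caller
            attempt = retry + 1
            # Persist progress before waiting so the download stays resumable
            # even if the app closes during the delay.
            save_state(attempt)
            sleep(backoff_delay(attempt))
```

With these assumed defaults the waits are roughly 2, 4, and 8 seconds, which gives a rerouted tunnel a chance to come back instead of burning all retries within a moment of the drop.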

gemini-code-assist (Contributor) commented

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the reliability and user experience of model downloads by introducing both automatic and manual retry mechanisms. It addresses issues where downloads might fail due to transient network problems, particularly common with VPN connections, by implementing intelligent error detection, exponential backoff, and a user-initiated retry option. The changes ensure that users can more easily recover from failed downloads without having to restart the entire process.

Highlights

  • User Interface Enhancement: Introduced a "Retry" button in the download manager UI, which becomes visible when a download fails and supports retrying.
  • ViewModel Retry Support: Extended PausableProgressItemViewModelBase with SupportsRetry, CanRetry, and RetryCommand to enable retry functionality for subclasses.
  • Download ViewModel Implementation: Implemented retry logic in DownloadProgressItemViewModel, allowing it to reset attempt counters and re-register failed downloads.
  • Robust Error Handling: Enhanced core download logic to detect transient network and SSL exceptions (including AuthenticationException) as retryable errors.
  • Adaptive Retry Mechanism: Implemented exponential backoff with jitter for automatic retries, persisting state before delay to ensure resumability and allow network conditions to stabilize.
  • Manual Retry State Management: Added a ResetAttempts method to TrackedDownload to cleanly reset the retry counter for manual retries.
  • Service Layer Integration: Introduced TryRestartDownload in ITrackedDownloadService and its implementation to re-add failed downloads to the tracking system for resumption.

Changelog

  • StabilityMatrix.Avalonia/ViewModels/Base/PausableProgressItemViewModelBase.cs
    • Added CanRetry to the NotifyPropertyChangedFor attribute list.
    • Introduced SupportsRetry virtual property, defaulting to false.
    • Added CanRetry property, which is true if SupportsRetry is true and the state is Failed.
    • Added RetryCommand and a virtual Retry method.
  • StabilityMatrix.Avalonia/ViewModels/Progress/DownloadProgressItemViewModel.cs
    • Overrode SupportsRetry to return true.
    • Implemented the Retry method to call download.ResetAttempts() and downloadService.TryRestartDownload(download).
  • StabilityMatrix.Avalonia/Views/ProgressManagerPage.axaml
    • Added a Button for retry, bound to RetryCommand and visible based on CanRetry.
  • StabilityMatrix.Core/Models/TrackedDownload.cs
    • Imported System.Security.Authentication.
    • Added IsTransientNetworkException static helper method to check for IOException or AuthenticationException (including inner and aggregate exceptions).
    • Modified the OnDownloadTaskCompleted method to use IsTransientNetworkException for retry condition.
    • Implemented exponential backoff with jitter for automatic retries, including state persistence before delay.
    • Added ResetAttempts method to reset the retry counter and set state to Inactive.
  • StabilityMatrix.Core/Services/ITrackedDownloadService.cs
    • Added Task TryRestartDownload(TrackedDownload download); to the interface.
  • StabilityMatrix.Core/Services/TrackedDownloadService.cs
    • Implemented TryRestartDownload to re-create the download's JSON file, re-add it to the tracking dictionary, and then call TryResumeDownload.

Activity

  • No human activity has been recorded on this pull request yet.

gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces a robust retry mechanism for model downloads, addressing failures on unstable connections like VPNs, and includes a manual 'Retry' button and exponential backoff for automatic retries. However, the current implementation has a significant race condition where multiple download tasks can be started for the same file, potentially leading to data corruption. Additionally, the automatic retry logic does not respect user cancellation during the backoff period, and there is a resource leak in the service layer when re-registering downloads. Furthermore, some file I/O operations in async methods are synchronous, which can block the calling thread, and should be made asynchronous to maintain application responsiveness.
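
One possible way to address the cancellation and duplicate-start concerns, sketched in Python rather than the PR's C#. All names here are hypothetical and this is not the fix that was applied in the PR: the idea is simply that the backoff wait should be interruptible, and that starting a download should be an atomic check-and-set.

```python
import threading


def wait_or_cancel(cancel_event: threading.Event, delay: float) -> bool:
    """Wait up to `delay` seconds; return False if cancelled during the wait."""
    # Event.wait returns True as soon as the event is set, False on timeout.
    return not cancel_event.wait(timeout=delay)


class ActiveDownloads:
    """Tracks which downloads currently have a running task."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = set()

    def try_begin(self, download_id) -> bool:
        # Check and insert under one lock so two callers cannot both succeed.
        with self._lock:
            if download_id in self._active:
                return False
            self._active.add(download_id)
            return True

    def end(self, download_id) -> None:
        with self._lock:
            self._active.discard(download_id)
```

A retry loop would call `wait_or_cancel` instead of an unconditional sleep and stop when it returns False, and both the automatic retry and the manual Retry button would go through `try_begin` before starting a task.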

NeuralFault (Contributor, Author) commented

Give me a bit to modify and thoroughly verify

NeuralFault (Contributor, Author) commented

@gemini-code-assist review

gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request effectively addresses download failures on unstable connections by introducing both automatic retries with exponential backoff and a manual retry option for the user. The changes are well-structured, extending the view models and services to support the new functionality. The detection of transient network errors, including SSL/TLS-related AuthenticationException, is a thoughtful addition that will improve robustness, particularly for users on VPNs. I have a couple of minor suggestions to improve maintainability by reducing code duplication and eliminating a magic number.

NeuralFault (Contributor, Author) commented

Ready for merge
