@healthkowshik

Summary

  • Cache handling: Check past_key_values before use; use get_seq_length().
  • Rotary embedding: Refactor sequence length calculations after cache updates.
  • Time quantization: Pre-quantize sequences during dataset initialization.
  • Time values: Add time_values handling in window datasets for CT-RoPE.
  • Dictionary inputs: Support dictionary inputs with error handling.
  • Interface compliance: Implement get_sequence_length_by_idx() method.
  • Package structure: Add MIRA/__init__.py for package initialization.
  • Naming: Standardize evaluation strategy parameter naming.
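The cache-handling item can be illustrated with a minimal sketch. It is not the code from this PR: `cached_length`, `rotary_length`, and `ToyCache` are hypothetical names, and the only thing taken from the summary is the pattern of checking past_key_values before use and asking the cache for its length through get_seq_length(). The sketch also assumes the length is computed before the new tokens are written into the cache.

```python
from typing import Optional, Protocol


class SupportsSeqLength(Protocol):
    """Anything exposing get_seq_length(), as cache objects do in the summary."""

    def get_seq_length(self, layer_idx: int = 0) -> int: ...


def cached_length(past_key_values: Optional[SupportsSeqLength], layer_idx: int = 0) -> int:
    """Return the number of cached positions, or 0 when there is no cache."""
    # Guard first: on the first forward pass the cache may be None.
    if past_key_values is None:
        return 0
    return past_key_values.get_seq_length(layer_idx)


def rotary_length(
    past_key_values: Optional[SupportsSeqLength],
    num_new_tokens: int,
    layer_idx: int = 0,
) -> int:
    """Total key/value length the rotary embedding has to cover.

    Assumption: called before the new tokens are appended to the cache.
    """
    return cached_length(past_key_values, layer_idx) + num_new_tokens


class ToyCache:
    """Hypothetical stand-in for a real cache object; stores only a length."""

    def __init__(self, length: int) -> None:
        self._length = length

    def get_seq_length(self, layer_idx: int = 0) -> int:
        return self._length


if __name__ == "__main__":
    print(rotary_length(None, 4))          # no cache yet
    print(rotary_length(ToyCache(10), 1))  # one new token on top of ten cached
```

Keeping the None check in one helper means every attention variant computes the length the same way instead of repeating the guard inline.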

Notes

  • Had to create a symbolic link from lowercase mira to uppercase MIRA:
      cd MIRA
      ln -s MIRA mira
  • Handle package version mismatches by installing and upgrading the dependencies:
      pip install --quiet -r requirements.txt
      pip install --quiet torchdiffeq
      pip install --quiet wandb -qU
      python3 -m pip install --upgrade --quiet accelerate peft transformers
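When chasing a version mismatch, it can help to see what is actually installed before and after running the commands above. The following is a small sketch using only the standard library; `installed_versions` is a hypothetical helper, and the package list simply repeats the names from the notes.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Not installed in the current environment.
            found[name] = None
    return found


if __name__ == "__main__":
    names = ["accelerate", "peft", "transformers", "torchdiffeq", "wandb"]
    for name, ver in installed_versions(names).items():
        print(f"{name}: {ver or 'not installed'}")
```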

…class to fulfill TimeSeriesDataset interface requirements.
…o support dictionary input types, ensuring robust error handling for missing keys.
…on2 classes to use get_seq_length method for improved clarity and consistency.
…ity with CT-RoPE. Implement slicing and padding for time_values, and generate default values when not provided.
…RAFlashAttention2 classes to improve rotary embedding handling and ensure accurate sequence length updates after cache operations.
…ed before using cache, improving robustness of cache operations.
…g initialization. This change ensures that time values are quantized once, improving efficiency and consistency in data handling.
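The dataset-side changes in these commits (dictionary inputs, default time_values, one-time quantization, slicing and padding per window, and get_sequence_length_by_idx()) can be sketched together in a toy class. Everything here is an assumption for illustration, not the MIRA implementation: the class name `ToyWindowDataset`, the `"sequence"` key, the rounding rule in `quantize`, the integer default time values, and padding with 0.0 are all hypothetical choices.

```python
from typing import Dict, List, Sequence, Tuple, Union

SeqInput = Union[Sequence[float], Dict[str, Sequence[float]]]


def quantize(times: Sequence[float], resolution: float) -> List[float]:
    """Snap each time value to the nearest multiple of `resolution` (assumed rule)."""
    return [round(t / resolution) * resolution for t in times]


class ToyWindowDataset:
    """Hypothetical stand-in for a windowed time-series dataset."""

    def __init__(self, items: Sequence[SeqInput], window_len: int,
                 resolution: float = 1.0, pad_value: float = 0.0) -> None:
        self.window_len = window_len
        self.pad_value = pad_value
        self.items: List[Tuple[List[float], List[float]]] = []
        for item in items:
            if isinstance(item, dict):
                # Fail early with a clear message when the required key is missing.
                if "sequence" not in item:
                    raise KeyError("dictionary input needs a 'sequence' key")
                values = list(item["sequence"])
                times = item.get("time_values")
            else:
                values, times = list(item), None
            if times is None:
                # Default: evenly spaced integer time steps.
                times = [float(i) for i in range(len(values))]
            elif len(times) != len(values):
                raise ValueError("time_values and sequence must have equal length")
            # Quantize once here, not on every window access.
            self.items.append((values, quantize(times, resolution)))

    def get_sequence_length_by_idx(self, idx: int) -> int:
        return len(self.items[idx][0])

    def window(self, idx: int, start: int) -> Dict[str, List[float]]:
        values, times = self.items[idx]
        end = start + self.window_len
        v, t = values[start:end], times[start:end]
        missing = self.window_len - len(v)  # pad short windows at the end
        return {"values": v + [self.pad_value] * missing,
                "time_values": t + [self.pad_value] * missing}


if __name__ == "__main__":
    ds = ToyWindowDataset(
        [{"sequence": [1.0, 2.0, 3.0], "time_values": [0.2, 0.6, 1.3]}, [4.0, 5.0]],
        window_len=2, resolution=0.5)
    print(ds.get_sequence_length_by_idx(0))
    print(ds.window(0, 2))
```

Quantizing in the constructor means each window only slices and pads already-quantized values, which is the efficiency point the last commit message makes.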
@healthkowshik
Author

@microsoft-github-policy-service agree
