test#305

Closed
1a1a11a wants to merge 8 commits into 1a1a11a:develop from LauYeeYu:compute

@1a1a11a 1a1a11a commented Mar 14, 2026

No description provided.

@1a1a11a 1a1a11a requested a review from Copilot March 14, 2026 01:12
@1a1a11a 1a1a11a self-assigned this Mar 14, 2026
@1a1a11a 1a1a11a requested a review from haochengxia as a code owner March 14, 2026 01:12

cursor bot commented Mar 14, 2026

You have used all of your free Bugbot PR reviews.

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the libCacheSim framework by integrating advanced cache eviction algorithms. The primary goal is to enable more sophisticated cache simulations that account for factors beyond simple access frequency or recency, such as the computational cost (compute intensity) associated with fetching or processing an object, and object size. These additions provide researchers and developers with more powerful tools to model and optimize cache performance in diverse scenarios.

Highlights

  • New Eviction Algorithms: Introduced several new cache eviction algorithms: BeladyCompute, LHD_compute, GDSF_compute, S3FIFOCompute, and S3FIFOSize. All but S3FIFOSize factor 'compute intensity' into their eviction logic; S3FIFOSize uses object size instead.
  • Compute Intensity Integration: Enhanced existing data structures (Belady_obj_metadata_t, cache_obj union) to store 'compute_intensity' for use by the new algorithms, allowing for more nuanced eviction decisions based on computational cost.
  • Eviction Process Recording: Added functionality to record the eviction process, including evicted object IDs, when the RECORD_EVICTION_PROCESS macro is defined. This feature aids in debugging and analyzing cache behavior.
  • Build System Updates: Updated the CMake build configuration to include the source files for the newly added eviction algorithms, ensuring they are compiled and linked correctly into the libCacheSim library.

Changelog
  • libCacheSim/bin/cachesim/cache_init.h
    • Registered new cache initialization functions for 'lhd_compute' and 'beladyCompute' algorithms.
    • Implemented logic to restrict 'beladyCompute' usage to specific trace formats ('oracleGeneral' and 'lcs') and provide user warnings for unsupported traces.
  • libCacheSim/cache/CMakeLists.txt
    • Added eviction/BeladyCompute.c to the list of C eviction sources.
    • Included eviction/S3FIFOSize.c and eviction/S3FIFOCompute.c in the C eviction sources.
    • Appended eviction/cpp/GDSF_compute.cpp, eviction/LHD/lhd_compute.cpp, and eviction/LHD/LHD_compute_interface.cpp to the C++ eviction sources.
  • libCacheSim/cache/cache.c
    • Introduced a conditional compilation block (#ifdef RECORD_EVICTION_PROCESS) to enable recording of eviction events.
    • Added functions set_new_record_eviction_process_file and print_eviction_debug_message for managing and writing to an eviction process log file.
    • Integrated logging of evicted object IDs into cache_remove_obj_base when eviction recording is active.
  • libCacheSim/cache/eviction/BeladyCompute.c
    • Added a new eviction algorithm, BeladyCompute, which considers both next access time and 'compute intensity' for eviction decisions.
    • Implemented logic for both exact and sampled Belady eviction, with parameters for controlling sampling size.
  • libCacheSim/cache/eviction/LHD/LHD_compute_interface.cpp
    • Created a C-compatible interface for the C++ LHDCompute eviction algorithm.
    • Implemented LHD_compute_init, LHD_compute_free, LHD_compute_get, LHD_compute_find, LHD_compute_insert, LHD_compute_to_evict, LHD_compute_evict, and LHD_compute_remove functions to integrate the C++ logic into the C framework.
  • libCacheSim/cache/eviction/LHD/lhd_compute.cpp
    • Implemented the core LHDCompute eviction algorithm, which dynamically adapts its policy based on observed hit and eviction rates.
    • Incorporated 'compute intensity' into the object ranking mechanism.
    • Included methods for reconfiguring the policy, adapting age coarsening, and updating class statistics.
  • libCacheSim/cache/eviction/LHD/lhd_compute.hpp
    • Defined the LHDCompute C++ class, including internal data structures like Tag (for object metadata) and Class (for tracking hit/eviction statistics).
    • Declared methods for updating, replacing, and ranking cache candidates, with specific consideration for 'compute intensity'.
  • libCacheSim/cache/eviction/S3FIFOCompute.c
    • Added a new S3-FIFO eviction algorithm variant, S3FIFOCompute, which prioritizes objects with higher 'compute intensity' for promotion to the main cache.
    • Modified the admission and eviction logic to factor in 'compute intensity' when calculating promotion thresholds.
  • libCacheSim/cache/eviction/S3FIFOSize.c
    • Added a new S3-FIFO eviction algorithm variant, S3FIFOSize, which considers object size when determining promotion to the main cache.
    • Adjusted the admission and eviction logic to use object size relative to the mean object size for promotion thresholds.
  • libCacheSim/cache/eviction/cpp/GDSF_compute.cpp
    • Implemented a new GDSF (Greedy Dual-Size Frequency) variant, GDSF_compute, that incorporates 'compute intensity' into its priority calculation.
    • Modified the priority calculation to use a combination of object frequency and the square root of 'compute intensity'.
  • libCacheSim/include/libCacheSim/cache.h
    • Declared external functions set_new_record_eviction_process_file and print_eviction_debug_message under the RECORD_EVICTION_PROCESS macro.
  • libCacheSim/include/libCacheSim/cacheObj.h
    • Added a compute_intensity field to the Belady_obj_metadata struct.
    • Defined a new struct GDSF_compute_obj_metadata_t to hold 'compute_intensity' for the GDSF_compute algorithm.
    • Extended the cache_obj union to include the GDSF_compute metadata field.
  • libCacheSim/include/libCacheSim/evictionAlgo.h
    • Declared the initialization functions for the new eviction algorithms: BeladyCompute_init, GDSF_compute_init, LHD_compute_init, and S3FIFOCompute_init.

@1a1a11a 1a1a11a marked this pull request as draft March 14, 2026 01:12
@1a1a11a 1a1a11a removed the request for review from haochengxia March 14, 2026 01:14

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces several new compute-aware cache eviction algorithms, including BeladyCompute, LHD_compute, S3FIFOCompute, and GDSF_compute. The changes are extensive and add significant new functionality. However, the review identified a critical issue with potential infinite recursion in BeladyCompute, several high-severity out-of-bounds memory access risks due to missing checks before accessing request features, and some medium-severity issues related to thread-safety and logical inconsistencies. These issues should be addressed to ensure the stability and correctness of the new algorithms.

(unsigned long long)cache->get_n_obj(cache),
(long long)cache->cache_size, (long long)req->obj_size,
params->n_sample, obj_to_evict_score, sampled_obj_score);
return BeladyCompute_to_evict(cache, req);

critical

The recursive call to BeladyCompute_to_evict(cache, req) when obj_to_evict is NULL can lead to infinite recursion and a stack overflow. This can happen if the cache is empty or if random sampling consistently fails to find a suitable eviction candidate. You should add a base case to prevent this, for example by using an iterative approach or limiting the recursion depth.
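
One way to get the base case the comment asks for is to replace the self-call with a bounded loop and a deterministic fallback. The sketch below is illustrative only: `toy_obj_t`, `toy_cache_t`, `sample_candidate`, `pick_victim`, and `MAX_SAMPLE_ROUNDS` are hypothetical names, not libCacheSim's API, and the scoring rule (higher score is a better victim) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not libCacheSim types. */
typedef struct {
  long obj_id;
  double score; /* assumption: higher score = better eviction candidate */
} toy_obj_t;

typedef struct {
  toy_obj_t *objs;
  size_t n_obj;
} toy_cache_t;

enum { MAX_SAMPLE_ROUNDS = 8 }; /* assumed retry budget */

/* Draw n_sample random objects; return the highest-scoring one whose score
 * exceeds min_score, or NULL if no sampled object qualifies. */
static toy_obj_t *sample_candidate(toy_cache_t *cache, int n_sample,
                                   double min_score) {
  toy_obj_t *best = NULL;
  for (int i = 0; i < n_sample; i++) {
    toy_obj_t *o = &cache->objs[(size_t)rand() % cache->n_obj];
    if (o->score > min_score && (best == NULL || o->score > best->score)) {
      best = o;
    }
  }
  return best;
}

/* Iterative replacement for "retry by calling yourself". An empty cache
 * returns NULL at once; once the retry budget is spent, a linear scan
 * picks the highest score, so the function always terminates. */
toy_obj_t *pick_victim(toy_cache_t *cache, int n_sample, double min_score) {
  assert(cache != NULL && n_sample > 0);
  if (cache->n_obj == 0) return NULL; /* base case: nothing to evict */

  for (int round = 0; round < MAX_SAMPLE_ROUNDS; round++) {
    toy_obj_t *victim = sample_candidate(cache, n_sample, min_score);
    if (victim != NULL) return victim;
  }

  /* Fallback: deterministic scan over every cached object. */
  toy_obj_t *best = &cache->objs[0];
  for (size_t i = 1; i < cache->n_obj; i++) {
    if (cache->objs[i].score > best->score) best = &cache->objs[i];
  }
  return best;
}
```

The caller still has to handle a NULL return for the empty-cache case, which the recursive version never surfaces.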

tag->timestamp = timestamp;
tag->app = DEFAULT_APP_ID % APP_CLASSES;
tag->size = 1;
tag->compute_intensity = req->features[0];

high

The code directly accesses req->features[0] without checking if req->n_features > 0. If a request arrives with no features, this will result in an out-of-bounds memory access, which is undefined behavior and can lead to a crash. You should add a check before accessing the feature.

  tag->compute_intensity = (req->n_features > 0) ? req->features[0] : 1;

// Higher obj_size -> higher ratio -> harder to promote
// Modified: ratio = mean_obj_size_in_small / req_compute_intensity
// Higher compute_intensity -> lower ratio -> easier to promote
double compute_intensity = req->features[0];

high

The code directly accesses req->features[0] without checking if req->n_features > 0. If a request arrives with no features, this will result in an out-of-bounds memory access, which is undefined behavior and can lead to a crash. This issue is present in multiple places in this file. You should add a check before accessing the feature.

    double compute_intensity = (req->n_features > 0) ? req->features[0] : 1.0;

cache_obj_t *obj = cache_insert_base(cache, req);
DEBUG_ASSERT(obj != nullptr);
obj->misc.freq = 1;
obj->GDSF_compute.compute_intensity = req->features[0];

high

The code directly accesses req->features[0] without checking if req->n_features > 0. If a request arrives with no features, this will result in an out-of-bounds memory access, which is undefined behavior and can lead to a crash. You should add a check before accessing the feature.

  obj->GDSF_compute.compute_intensity = (req->n_features > 0) ? req->features[0] : 1;

Comment on lines +14 to +32
#ifdef RECORD_EVICTION_PROCESS
FILE *eviction_process_ofile = NULL;

void set_new_record_eviction_process_file(const char *ofilepath) {
  if (eviction_process_ofile != NULL) {
    fclose(eviction_process_ofile);
  }
  eviction_process_ofile = fopen(ofilepath, "w");
  if (eviction_process_ofile == NULL) {
    ERROR("cannot open eviction process file %s\n", ofilepath);
  }
}

void print_eviction_debug_message(const char *msg) {
  if (eviction_process_ofile != NULL) {
    fprintf(eviction_process_ofile, "%s\n", msg);
  }
}
#endif /* RECORD_EVICTION_PROCESS */

medium

The RECORD_EVICTION_PROCESS feature uses a global file pointer eviction_process_ofile. This is not thread-safe. If multiple threads use the cache simulator simultaneously, they will race on this global variable, leading to corrupted output or crashes. Consider making this thread-local using __thread or protecting access with a mutex if multi-threaded use is intended.
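
The thread-local option could look roughly like the sketch below, which uses the C11 `_Thread_local` specifier (`__thread` is the GCC/Clang spelling). The function names `eviction_log_open`, `eviction_log_write`, and `eviction_log_close` are hypothetical and differ from the PR's functions. The sketch assumes each thread opens its own output path.

```c
#include <assert.h>
#include <stdio.h>

/* One pointer per thread, so threads no longer race on a shared global.
 * Assumption: every thread opens a distinct file path. */
static _Thread_local FILE *eviction_log = NULL;

/* Open (or reopen) the calling thread's log file.
 * Returns 0 on success and -1 if the file cannot be opened. */
int eviction_log_open(const char *path) {
  assert(path != NULL);
  if (eviction_log != NULL) fclose(eviction_log);
  eviction_log = fopen(path, "w");
  return eviction_log != NULL ? 0 : -1;
}

/* Append one line to the calling thread's log; no-op if none is open. */
void eviction_log_write(const char *msg) {
  if (eviction_log != NULL) fprintf(eviction_log, "%s\n", msg);
}

/* Close the calling thread's log and reset the pointer. */
void eviction_log_close(void) {
  if (eviction_log != NULL) {
    fclose(eviction_log);
    eviction_log = NULL;
  }
}
```

Thread-local storage removes the race on the pointer itself, but it does not help if all threads must write to one shared file; that case still needs the mutex the comment mentions.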

Comment on lines +140 to +142
if (req->n_features > 0) {
  obj->Belady.compute_intensity = req->features[0];
}

medium

In BeladyCompute_find, when req->n_features <= 0, the compute_intensity of the object is not updated. This is inconsistent with BeladyCompute_insert, which sets a default value of 1 in this case. An object might retain a stale compute_intensity from a previous access. It would be more robust to also set a default value here.

    if (req->n_features > 0) {
      obj->Belady.compute_intensity = req->features[0];
    } else {
      obj->Belady.compute_intensity = 1;
    }

obj->misc.freq = 1;
obj->GDSF_compute.compute_intensity = req->features[0];

double pri = gdsf->pri_last_evict + req->features[0];

medium

The priority calculation for a new object on insert seems inconsistent with the priority update on a cache hit. On a hit (in GDSF_compute_find), the priority is updated using sqrt(compute_intensity). However, on insert, the priority is calculated using compute_intensity directly. For consistency, you should probably use sqrt here as well.

  double pri = gdsf->pri_last_evict + sqrt((double)obj->GDSF_compute.compute_intensity);

Copilot AI left a comment

Pull request overview

This PR adds compute-intensity-aware variants of several cache eviction algorithms (Belady, S3FIFO, GDSF, LHD) to libCacheSim. These variants use req->features[0] as a compute intensity signal to bias eviction decisions toward keeping compute-intensive objects cached longer.

Changes:

  • New eviction algorithms: BeladyCompute, S3FIFOCompute, S3FIFOSize, GDSF_compute, LHD_compute
  • Debug infrastructure for recording eviction processes (RECORD_EVICTION_PROCESS)
  • Registration of some (but not all) new algorithms in the cachesim binary

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| evictionAlgo.h | Declares init functions for new algorithms |
| cacheObj.h | Adds compute_intensity fields to cache object metadata |
| cache.h | Adds optional eviction process recording API |
| cache.c | Implements eviction recording; logs evicted obj IDs |
| BeladyCompute.c | Belady variant using reuse_distance/compute_intensity scoring |
| S3FIFOSize.c | Size-aware S3-FIFO variant |
| S3FIFOCompute.c | Compute-intensity-aware S3-FIFO variant |
| GDSF_compute.cpp | GDSF variant incorporating compute intensity into priority |
| lhd_compute.hpp/cpp | LHD variant scaling hit density by compute intensity |
| LHD_compute_interface.cpp | C interface for LHD compute variant |
| CMakeLists.txt | Registers new source files for build |
| cache_init.h | Registers lhd_compute and beladyCompute in cachesim CLI |

if (compute_intensity <= 0) compute_intensity = 1.0; // Avoid division by zero
double ratio = mean_obj_size_in_small / compute_intensity;

cache_obj_t *ghost_obj = ghost_q->find(ghost_q, req, false);

Comment on lines +392 to +394
// MODIFIED: Apply compute intensity logic
double compute_intensity = req->features[0];
if (compute_intensity <= 0) compute_intensity = 1.0; // Avoid division by zero

Comment on lines +428 to +429
// MODIFIED: Apply compute intensity logic
double compute_intensity = req->features[0];

Comment on lines +261 to +267
#endif
    return obj;
  }

  obj = params->main->find(params->main, req, true);
  if (obj != NULL) {
    obj->S3FIFO.freq += 1;

Comment on lines +178 to +183
obj = params->ghost->find(params->ghost, req, false);
if (obj == NULL) {
  obj = params->ghost->insert(params->ghost, req);
  obj->S3FIFO.freq = 1;
} else {
  obj->S3FIFO.freq += 1;

Comment on lines +179 to +184
obj = params->ghost->find(params->ghost, req, false);
if (obj == NULL) {
  obj = params->ghost->insert(params->ghost, req);
  obj->S3FIFO.freq = 1;
} else {
  obj->S3FIFO.freq += 1;

double ratio = (double)req->obj_size / mean_obj_size_in_small;

cache_obj_t *ghost_obj = ghost_q->find(ghost_q, req, false);

char *params_str = strdup(cache_specific_params);
char *old_params_str = params_str;
char *end;

  cache_evict_base(cache, obj_to_evict, true);
}

bool BeladyCompute_remove(cache_t *cache, const obj_id_t obj_id) {

@1a1a11a 1a1a11a closed this Mar 14, 2026

3 participants