26814fb3 | 16-Apr-2025 |
HuSipeng <[email protected]> |
feat(Ftq): split Ftq meta SRAM into smaller size (#4569)
split Ftq meta SRAM into smaller size: (64 × 160) × 2 -> (64 × 80) × 4 |
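The dimensions above keep the depth at 64 and halve the width, which suggests the wide meta word is sliced across more, narrower banks. A minimal behavioral sketch of that idea follows; the class `SplitSram` and its parameters are hypothetical and are not taken from the XiangShan source, which implements this in Chisel.

```python
class SplitSram:
    """Behavioral model (hypothetical): one wide logical word stored
    across several narrower banks that share the same address."""

    def __init__(self, depth, width, ways):
        assert width % ways == 0, "width must divide evenly across banks"
        self.depth = depth
        self.ways = ways
        self.slice_width = width // ways
        # One list of `depth` entries per bank.
        self.banks = [[0] * depth for _ in range(ways)]

    def write(self, addr, data):
        # Slice the wide word; bank 0 holds the least significant slice.
        mask = (1 << self.slice_width) - 1
        for i in range(self.ways):
            self.banks[i][addr] = (data >> (i * self.slice_width)) & mask

    def read(self, addr):
        # Read every bank at the same address and concatenate the slices.
        word = 0
        for i in range(self.ways):
            word |= self.banks[i][addr] << (i * self.slice_width)
        return word
```

For example, `SplitSram(depth=64, width=320, ways=4)` models four 64 × 80 banks that together hold one 320-bit entry per address.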
e5325730 | 15-Apr-2025 |
cz4e <[email protected]> |
fix(DFT): fix `DFT` cgen connection (#4565) |
30f35717 | 14-Apr-2025 |
cz4e <[email protected]> |
refactor(DFT): refactor `DFT` IO (#4530) |
8795ffc0 | 10-Apr-2025 |
Sam Castleberry <[email protected]> |
feat: move frontend SRAM read-write conflict handling to SRAMTemplate (#4445)
Hello, this change set is to remove the SRAM read-write conflict handling logic in the frontend, after OpenXiangShan/Utility#110 has been merged, which adds this logic to the SRAMTemplate. See that pull request and also #4242 for more context.
After this change, I see microbench IPC change 1.397 -> 1.413 and coremark IPC change 2.136 -> 2.147. The branch mispredictions also decreased slightly in both.
This probably cannot be merged automatically, since the utility submodule should point to the new revision after merging instead of the revision in my branch.
Thanks, Sam
|
1592abd1 | 08-Apr-2025 |
Yan Xu <[email protected]> |
feat: support inst lifetime trace (#4007)
PerfCCT (performance counter commit trace) is an instruction-level performance counter, like the one in GEM5.
How to use this:
1. Make with the "WITH_CHISELDB=1" argument.
2. Run with "--dump-db --dump-select-db lifetime" to get the database.
3. To visualize instruction lifetimes, run "python3 scripts/perfcct.py "the-db-file-path" -p 1 -v | less".
4. The analysis script is currently in the XS-GEM5 repo; see https://github.com/OpenXiangShan/GEM5/blob/xs-dev/util/ClockAnalysis.py
How it works:
1. Allocate one unique tag "seqNum", as GEM5 does, for each instruction at the fetch stage.
2. Pass the "seqNum" along through each pipeline stage.
3. Record perf data through the DPIC interface.
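The tagging scheme described above can be sketched in software as follows. This is only an illustration under stated assumptions: the class `LifetimeTracer`, its methods, and the stage names are hypothetical, and a plain dictionary stands in for the database that the real design fills through the DPIC interface.

```python
import itertools


class LifetimeTracer:
    """Hypothetical model of per-instruction lifetime tracing."""

    def __init__(self):
        self._next_seq = itertools.count()  # source of unique seqNum tags
        self.records = {}                   # seq_num -> {stage name: cycle}

    def fetch(self, cycle):
        # Allocate a unique tag when the instruction is fetched.
        seq = next(self._next_seq)
        self.records[seq] = {"fetch": cycle}
        return seq

    def record(self, seq, stage, cycle):
        # Each later stage reports the cycle at which it saw this tag.
        self.records[seq][stage] = cycle

    def lifetime(self, seq):
        # Cycles between fetch and the latest recorded stage.
        stamps = self.records[seq]
        return max(stamps.values()) - stamps["fetch"]
```

Because the tag travels with the instruction, every stage can report against the same key without needing to know anything else about the instruction.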
|
93b51ff0 | 03-Apr-2025 |
HuSipeng <[email protected]> |
fix(FTB, FTQ): dont use CPL2 SplittedSRAM (#4485)
If the frontend directly uses the SplittedSRAM of coupledL2, the frontend's SRAM will be marked as a multi-cycle path, the same as coupledL2's SRAM. |
602aa9f1 | 02-Apr-2025 |
cz4e <[email protected]> |
feat(Sram): add `SRAM_CTL` interface (#4474)
* add `SRAM_CTL` interface for SRAMTemplate
* use `SRAM_WITH_CTL` to enable, e.g. `make sim-verilog CONFIG=KunminghuV2Config RELEASE=1 SRAM_WITH_CTL=1`
|
af7336e5 | 31-Mar-2025 |
zhou tao <[email protected]> |
area(ICache): split ICache meta SRAM (#4468)
As per the requirements of the physical backend, split the ICache's Tag SRAM into smaller blocks. |
d6844cf0 | 28-Mar-2025 |
xu_zh <[email protected]> |
fix(IPrefetchPipe): consider backend exception as part of itlb exception (#4423)
`s1_exception_out` is for prefetch s2 only, but we want backend exception to be considered as part of itlb exception and sent to waylookup, so we merge it to `s1_itlb_exception`
|
721555e1 | 17-Mar-2025 |
HuSipeng <[email protected]> |
feat(FTB, FTQ): split FTB meta SRAM and FTQ meta SRAM (#4360)
FTB meta SRAM: 512 × 320 -> (512 × 40) × 8
FTQ meta SRAM: 64 × 320 -> (64 × 160) × 2 |
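As a quick arithmetic check of the dimensions listed above, the total bit capacity should be unchanged by each split. The helper `total_bits` below is a hypothetical name used only for this illustration.

```python
def total_bits(depth, width, count=1):
    """Total storage in bits for `count` SRAM macros of depth x width."""
    return depth * width * count


# FTB meta SRAM: 512 x 320 -> (512 x 40) x 8
ftb_before = total_bits(512, 320)
ftb_after = total_bits(512, 40, 8)

# FTQ meta SRAM: 64 x 320 -> (64 x 160) x 2
ftq_before = total_bits(64, 320)
ftq_after = total_bits(64, 160, 2)

print(ftb_before == ftb_after, ftq_before == ftq_after)
```

Both comparisons hold, so the splits change only the shape of the macros, not the amount of state stored.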
d7ff1926 | 12-Mar-2025 |
zhou tao <[email protected]> |
feat(ITTage,Tage): split ITTage SRAM and Tage SRAM (#4376) |
dfb03ba2 | 10-Mar-2025 |
xu_zh <[email protected]> |
fix(IFU): handle uncache corrupt (#4301)
When the InstrUncache TileLink bus reports `d.bits.corrupt` or `d.bits.denied` (which is included in `d.bits.corrupt`), mark the fetch block as `access fault` and skip `m_resendTLB` etc.
Also:
- remove `currentIsRVC`, as it is identical to `mmio_is_RVC`
- fix `crossPageIPFFix`; it should be valid only when `mmio_has_resend`
- rename `mmio_resend_exception` to `mmio_exception`, since it is also used to store TileLink corrupt before resend
Update: rebased to Feb-28-2025-66e9b546 for regression test.
|
11269ca7 | 09-Mar-2025 |
Tang Haojin <[email protected]> |
chore: fix several deprecation warning (#4352) |
9928cec7 | 02-Mar-2025 |
zhou tao <[email protected]> |
feat(RAS): change the stall mechanism upon return stack overflow to dynamically disable the return stack. (#4317)
1. Predictor pipeline stalls exhibit poor fault tolerance.
2. Speculative queue overflow (requiring 32 uncommitted call/return instructions) is an extreme scenario where disabling return stack prediction incurs negligible performance impact.
3. Queue overflow often indicates recursion. In such cases, using top-of-stack data (static return addresses) may outperform IT-TAGE predictions despite disabled return stack.
|
a67fd0f5 | 28-Feb-2025 |
Guanghui Cheng <[email protected]> |
fix(PFEvent): use `CSRModule` for distribute_csr in PFEvent (#4321) |
4b2c87ba | 27-Feb-2025 |
梁森 Liang Sen <[email protected]> |
feat(dfx): integerate dfx components (#4312) |
8882eb68 | 21-Feb-2025 |
Xin Tian <[email protected]> |
feat(bitmap/memenc): support memory isolation by bitmap checking and memory encrpty used SM4-XTS (#3980)
- Add a bitmap module in the MMU for memory isolation
- Add a memory encryption module based on the AXI protocol
- These modules can be disabled by setting the options `HasMEMencryption` and `HasBitmapCheck` to false
|
fa84f222 | 18-Feb-2025 |
zhou tao <[email protected]> |
timing(icache): restore the relaxation of ICG for icache data (#4255)
Revert #4246 |
7f475a24 | 14-Feb-2025 |
HuSipeng <[email protected]> |
fix(PreDecode): fix fixedTaken for jalr (#4269)
This PR is a supplement to
https://github.com/OpenXiangShan/XiangShan/pull/4234, correctly setting
the ftqOffset when cfi is jalr. |
981114e1 | 07-Feb-2025 |
zhou tao <[email protected]> |
timing(icache): remove tag-related clock gating for timing (#4246) |
d1394225 | 27-Jan-2025 |
zhou tao <[email protected]> |
fix(RAS): adjust the signal judgment of isCall and isRet during redirection (#4232)
If the instruction is invalid, the corresponding pre-decoding information should be 0, because when the IFU module detects a prediction error, the misOffset it issues may not correspond to a valid instruction.
|
c670557f | 26-Jan-2025 |
HuSipeng <[email protected]> |
fix(IFU): add range checking for instruction blocks containing jalr (#4234)
When there is a jalr instruction in the middle of an instruction block but the BPU fails to predict it, the IFU should adjust the length of the instruction block to terminate at the jalr instruction.
However, the IFU currently does not check for this scenario, which may result in the unintended execution of instructions following the jalr that should not have been executed. This PR fixed this issue.
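The range check described above can be illustrated with a small software model. This is a sketch under stated assumptions, not the IFU's logic: `PreDecoded`, `fixed_block_end`, and the use of integer slot offsets are hypothetical, and the instruction list is assumed to be sorted by offset.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PreDecoded:
    offset: int    # slot index of the instruction within the fetch block
    is_jalr: bool  # pre-decode says this instruction is a jalr


def fixed_block_end(insts: List[PreDecoded], predicted_end: int) -> int:
    """Return the offset of the last instruction that should stay valid.

    If a jalr appears at or before the predicted end of the block, the
    block is cut at that jalr; otherwise the predicted end is kept.
    """
    for inst in insts:
        if inst.offset > predicted_end:
            break  # beyond the predicted block, nothing left to check
        if inst.is_jalr:
            return inst.offset
    return predicted_end
```

With a jalr at offset 1 and a predicted end of 3, the function returns 1, so the instructions at offsets 2 and 3 are dropped from the block.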
|
92330f9c | 24-Jan-2025 |
Easton Man <[email protected]> |
timing(frontend): remove bad timing clock gating (#4223)
- Remove `mispred_mask` from ITTAGE update logic due to timing issues
- Remove `mispred_mask` from TAGE update logic due to timing issues
- Disable clock gating in ICacheDataArray to improve timing
|
1fe7f8b4 | 24-Jan-2025 |
zhou tao <[email protected]> |
timing(ittage): optimize the timing of the ittage path for reading the jump address (#4216) |
6f9d4832 | 22-Jan-2025 |
HuSipeng <[email protected]> |
fix(IFU): remove useless bpu override flush logic (#4210)
When an override occurs in BPU S3 stage, the corresponding req can at
most reach the IFU F0 stage. |