
[IX] - Test run (partial run) #7904

Open
@txau

Description


We need to be able to perform a "test run" of information extraction before processing the whole pipeline. At the moment, we have a single "Find suggestions" button that trains a model from scratch and then processes ALL of the entities.

When there are too many entities, this process consumes too many resources and takes a long time.

Users could greatly benefit from a "test run": a subset of at most (e.g.) 1,000 entities is sent for training, and the trained model is then used to process (e.g.) 1,000 additional entities. This way users can check the results, add labeled data as needed, and refine the model before processing the whole database.


Problem statement

Currently, the information extraction pipeline processes ALL entities when a user clicks the "Find suggestions" button. This approach:

  • Trains a model from scratch for each run
  • Processes the entire dataset of entities at once
  • Consumes excessive computational resources
  • Takes a long time to complete when there are many entities
  • Doesn't allow users to validate or refine the model before committing to a full run

Proposed solution

Implement a "Test Run" feature that allows users to:

  • Train the model on a limited subset of entities
  • Test the trained model on another small subset of entities
  • Review results and refine the model (by adding labeled data) before processing the entire dataset

Acceptance criteria

  • Add a new "Test Run" button to the UI alongside the existing "Find suggestions" button (the final button names may be adjusted)
  • When "Test Run" is clicked, the system should:
    • Select a maximum of 2,000 entities for training the model (as today)
    • Train the model using only these entities
    • Process an additional 1,000 entities using the trained model (these should not be the ones used for training)
    • Display the results to the user (test-run results should appear first in the list)
  • Users should be able to review the test results and add labeled data as needed
  • After reviewing, users should have the option to:
    • Run another test run
    • Proceed with processing the entire dataset
  • The UI should clearly indicate when a test run is in progress vs. a full processing run
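The selection step in the criteria above could be sketched as follows. This is only an illustrative sketch, not existing Uwazi code: the function name `selectTestRunSets` and the assumption that entities are identified by string IDs are hypothetical; the size limits mirror the acceptance criteria (2,000 for training, 1,000 for testing).

```typescript
// Illustrative sketch: split entity IDs into disjoint training and test
// sets for a test run. Limits taken from the acceptance criteria.
const TRAINING_LIMIT = 2000;
const TEST_LIMIT = 1000;

function selectTestRunSets(entityIds: string[]): {
  training: string[];
  test: string[];
} {
  // Take up to 2,000 entities to train the model...
  const training = entityIds.slice(0, TRAINING_LIMIT);
  // ...and up to 1,000 DIFFERENT entities to process with the trained
  // model, so the test set never overlaps the training set.
  const test = entityIds.slice(TRAINING_LIMIT, TRAINING_LIMIT + TEST_LIMIT);
  return { training, test };
}
```

In a real implementation the training slice would be restricted to entities that already have labeled data, and the split would likely be randomized rather than positional (see the sampling note under general considerations).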

General considerations

  • The current pipeline architecture needs to be modified to support partial processing
  • Need to implement logic for selecting representative subsets of entities for training and testing
  • Question: at how many entities does this feature provide substantial value and differ meaningfully from the current default "Find suggestions" action?
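One simple baseline for the "representative subsets" point above is uniform random sampling without replacement. A sketch, assuming nothing about the existing pipeline (the function name is hypothetical); stratified sampling, e.g. by template, could be layered on top:

```typescript
// Illustrative: pick a uniform random sample of `n` items without
// replacement using a partial Fisher-Yates shuffle. Uniform sampling
// is a simple baseline for "representative" subsets.
function sampleWithoutReplacement<T>(items: T[], n: number): T[] {
  const pool = items.slice(); // copy so the input is not mutated
  const size = Math.min(n, pool.length);
  for (let i = 0; i < size; i++) {
    // Choose a random index in [i, pool.length) and swap it into place.
    const j = i + Math.floor(Math.random() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, size);
}
```

Sampling training and test entities from a shared shuffled pool would also keep the two sets disjoint by construction.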

Error states and messages

TBD

UI designs

To be added by @juanmnl
