You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I'm trying to prepare dataset splits for a problem I'm working on, and would ultimately like a hybrid of Stratified K-Fold and Grouped K-Fold. Is there a way to accomplish this using logic already built into sklearn? If not, where would be the right place for me to add it/do you have any suggestions for how to get started before I give it a go?
For a bit more context, my dataset is a considerably larger version of the following structure:
groups= [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4] # you can think of this as a sample-idy= [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2] # this is some trait of the samples
The objective would be to balance y across the train and test sets, and have each group only represented on one side of the folds. Thanks in advance!
The text was updated successfully, but these errors were encountered:
Hi!
I'm trying to prepare dataset splits for a problem I'm working on, and would ultimately like a hybrid of Stratified K-Fold and Grouped K-Fold. Is there a way to accomplish this using logic already built into sklearn? If not, where would be the right place for me to add it/do you have any suggestions for how to get started before I give it a go?
For a bit more context, my dataset is a considerably larger version of the following structure:
The objective would be to balance
y
across the train and test sets, and have eachgroup
only represented on one side of the folds. Thanks in advance!The text was updated successfully, but these errors were encountered: