CHNOV_FACTOR
Define a chunk novelty factor to manipulate how many of a test item's bigrams (or chunks of another size) appear in no training item.
Contents
Basic chunk novelty factor
CHNOV_FACTOR(aglss, 'fname') defines target values for low and high chunk novelty test items, containing 0 or 1 novel bigrams, respectively. The selection of training items will avoid some bigrams that would otherwise be allowed by the grammar.
Chunk novelty is the number of a test item's bigrams (or chunks of a different size, if specified) that do not appear in training items.
The example below generates training items based on the XXX_GRAMMAR. Test items are then generated that have either low or high novelty relative to bigrams in the training items.
The first output variable returned by CHNOV_FACTOR is an AGLSS object, updated to reflect the chunk novelty factor. The second output variable is a cell array of strings naming the different levels of chunk novelty defined. Actual chunk novelty target values are returned as the third output variable.
Only two of the four possible bigrams occur in the training items, even though they would be allowed by the grammar. By default, CHNOV_FACTOR excludes two bigrams from training so that they can be used as novel bigrams in test items.
The display of test items lists chunk novelty scores for unigrams, bigrams, and trigrams, along with the target values. Since targets are specified here only for bigrams, the target values for unigrams and trigrams are listed as 'NaN' (for "not a number").
    % Candidate strings of length 3 to 10 from the XXX grammar
    s_xxx = aglss(xxx_grammar, [3 10]);
    [s, levnames, tgts] = chnov_factor(s_xxx, 'ChNov');
    levnames
    tgts
    s = factorial_testsets(s, {'ChNov', levnames{:}});
    % 6 training items, 2 test items per test set
    s = choose_items(s, 6, 2);
    disp('Training items:'); disp(format_train_items(s));
    disp('Test items:'); disp(format_test_items(s));
    Potential items:
      Grammar involves 2 symbols (xy)
      2040 possible strings of length 3-10
       180 grammatical strings ( 8.82%)
      1860 ungrammatical strings (91.18%)
    Using all 180 grammatical strings
    Using all 1860 ungrammatical strings
    1 of 8 chunks of length 3 appear in no grammatical items
    levnames =
        'LowChNov'    'HighChNov'
    tgts =
         0     1
    Choosing training item 1....
    Choosing training item 2....
    Choosing training item 3....
    Choosing training item 4....
    Updating potential items....
    Choosing test item 1 for each set... 1 2.
    Choosing training item 5....
    Updating potential items....
    Choosing test item 2 for each set... 2 1.
    Choosing training item 6....
    Training items:
    Itm_num  Itm_name
    01       xyxy
    02       xyxyxyxyxy
    03       xyx
    04       yxy
    05       xyxyxyxy
    06       xyxyxy
    Test items:
    Tset_num  ChNov_cat  Itm_num  Itm_name  ChNov_1  ChNov_2  ChNov_3  ChNov_tgt_1  ChNov_tgt_2  ChNov_tgt_3
    01        LowChNov   01       yxyxy     0.000    0.000    0.000    NaN          0.000        NaN
    01        LowChNov   02       yxyx      0.000    0.000    0.000    NaN          0.000        NaN
    02        HighChNov  01       xyyxyx    0.000    1.000    2.000    NaN          1.000        NaN
    02        HighChNov  02       xxyxyxyx  0.000    1.000    1.000    NaN          1.000        NaN
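The scores in the ChNov_1, ChNov_2, and ChNov_3 columns can be reproduced by hand. The following is a minimal stand-alone sketch, not the toolbox's own implementation. It assumes that chunk novelty counts every occurrence of a chunk that is absent from all training items, which is consistent with the scores displayed above. The names chunks_of, flatten, train_chunks, and novelty are hypothetical helpers introduced only for illustration.

```matlab
% Stand-alone sketch: no toolbox functions are called here.
% Training items copied from the display above.
train = {'xyxy', 'xyxyxyxyxy', 'xyx', 'yxy', 'xyxyxyxy', 'xyxyxy'};

% All chunks of length n in string s, in order of occurrence.
chunks_of = @(s, n) arrayfun(@(k) s(k:k+n-1), 1:(length(s)-n+1), ...
                             'UniformOutput', false);

% Concatenate a cell array of row cell arrays into one row cell array.
flatten = @(c) [c{:}];

% Distinct chunks of length n that occur in any training item.
train_chunks = @(n) unique(flatten(cellfun(@(s) chunks_of(s, n), train, ...
                                           'UniformOutput', false)));

% Number of chunk occurrences in item that appear in no training item.
% Assumption: occurrences are counted, not distinct chunk types.
novelty = @(item, n) sum(~ismember(chunks_of(item, n), train_chunks(n)));

% Unigram, bigram, and trigram novelty of the test item 'xyyxyx'.
scores = arrayfun(@(n) novelty('xyyxyx', n), 1:3);
disp(scores)   % 0 1 2, matching the row for 'xyyxyx'
```

The bigram 'yy' accounts for the bigram score of 1, and the trigrams 'xyy' and 'yyx' account for the trigram score of 2.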
Chunk novelty categories
CHNOV_FACTOR(aglss, 'fname', [z1 z2 ... zN]) specifies chunk novelty values for N categories. Test items in category I will have Z(I) novel bigrams (or chunks of a different size, if specified).
The example below builds on the AGLSS object S_XXX created for the previous example. Three levels of chunk novelty are defined, for test items containing 0, 1, or 2 novel bigrams.
    [s, levnames, tgts] = chnov_factor(s_xxx, 'MyChunkNov', [0 1 2]);
    levnames
    tgts
    s = factorial_testsets(s, {'MyChunkNov', levnames{:}});
    s = choose_items(s, 6, 2);
    disp('Training items:'); disp(format_train_items(s));
    disp('Test items:'); disp(format_test_items(s));
    1 of 8 chunks of length 3 appear in no grammatical items
    levnames =
        'MyChunkNov1'    'MyChunkNov2'    'MyChunkNov3'
    tgts =
         0     1     2
    Choosing training item 1....
    Choosing training item 2....
    Choosing training item 3....
    Choosing training item 4....
    Updating potential items....
    Choosing test item 1 for each set... 2 1 3.
    Choosing training item 5....
    Updating potential items....
    Choosing test item 2 for each set... 2 3 1.
    Choosing training item 6....
    Training items:
    Itm_num  Itm_name
    01       xyxy
    02       xyxyxyxyxy
    03       xyx
    04       yxy
    05       xyxyxyxy
    06       xyxyxy
    Test items:
    Tset_num  MyChunkNov_cat  Itm_num  Itm_name    MyChunkNov_1  MyChunkNov_2  MyChunkNov_3  MyChunkNov_tgt_1  MyChunkNov_tgt_2  MyChunkNov_tgt_3
    01        MyChunkNov1     01       yxyxy       0.000         0.000         0.000         NaN               0.000             NaN
    01        MyChunkNov1     02       yxyx        0.000         0.000         0.000         NaN               0.000             NaN
    02        MyChunkNov2     01       xyyxyx      0.000         1.000         2.000         NaN               1.000             NaN
    02        MyChunkNov2     02       xxyxyxyx    0.000         1.000         1.000         NaN               1.000             NaN
    03        MyChunkNov3     01       yxxyyxyx    0.000         2.000         4.000         NaN               2.000             NaN
    03        MyChunkNov3     02       yxyxyyxyyx  0.000         2.000         4.000         NaN               2.000             NaN
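The target columns in this display can be read as constraints: a NaN target leaves that chunk size unconstrained, so only the bigram score decides which category an item belongs to. The sketch below illustrates that reading with three rows copied from the display. It is plain MATLAB with no toolbox calls, and the variable names are chosen for illustration only.

```matlab
% Scores (unigram, bigram, trigram) copied from three rows of the display.
items  = {'yxyxy', 'xyyxyx', 'yxxyyxyx'};
scores = [0 0 0;
          0 1 2;
          0 2 4];

% One target row per category; NaN marks a chunk size with no target.
targets = [NaN 0 NaN;
           NaN 1 NaN;
           NaN 2 NaN];

% For each item, find the category whose non-NaN targets it meets.
category = zeros(1, numel(items));
for i = 1:numel(items)
    for c = 1:size(targets, 1)
        constrained = ~isnan(targets(c, :));
        if isequal(scores(i, constrained), targets(c, constrained))
            category(i) = c;
        end
    end
end
disp(category)   % 1 2 3
```

The trigram scores differ across categories (0, 2, 4) but play no part in the assignment, because no trigram target was set.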
Naming chunk novelty categories
CHNOV_FACTOR(aglss, 'fname', T, {'name1', 'name2', ...}) specifies names for the different categories of chunk novelty, as an alternative to the default names otherwise assigned.
The example below builds on the AGLSS object S_XXX created for a previous example. Passing an empty array for the target values keeps the default targets of 0 and 1, and the two resulting levels of chunk novelty are named 'L' and 'H' (for Low and High). These names appear in the display of test items.
    [s, levnames, tgts] = chnov_factor(s_xxx, 'ChNov', [], {'L', 'H'});
    levnames
    tgts
    s = factorial_testsets(s, {'ChNov', levnames{:}});
    s = choose_items(s, 6, 2);
    disp('Training items:'); disp(format_train_items(s));
    disp('Test items:'); disp(format_test_items(s));
    1 of 8 chunks of length 3 appear in no grammatical items
    levnames =
        'L'    'H'
    tgts =
         0     1
    Choosing training item 1....
    Choosing training item 2....
    Choosing training item 3....
    Choosing training item 4....
    Updating potential items....
    Choosing test item 1 for each set... 2 1.
    Choosing training item 5....
    Updating potential items....
    Choosing test item 2 for each set... 1 2.
    Choosing training item 6....
    Training items:
    Itm_num  Itm_name
    01       xyxy
    02       xyxyxyxyxy
    03       xyx
    04       yxy
    05       xyxyxyxy
    06       xyxyxy
    Test items:
    Tset_num  ChNov_cat  Itm_num  Itm_name  ChNov_1  ChNov_2  ChNov_3  ChNov_tgt_1  ChNov_tgt_2  ChNov_tgt_3
    01        L          01       yxyxy     0.000    0.000    0.000    NaN          0.000        NaN
    01        L          02       yxyx      0.000    0.000    0.000    NaN          0.000        NaN
    02        H          01       xyyxyx    0.000    1.000    2.000    NaN          1.000        NaN
    02        H          02       xxyxyxyx  0.000    1.000    1.000    NaN          1.000        NaN
Reserving novel grammatical chunks
CHNOV_FACTOR(aglss, 'fname', T, NAMES, M) specifies the minimum number of grammatical bigrams to exclude from training items, over and above any ungrammatical bigrams. M determines the number of different bigrams (or chunks of a different length, if specified) that will be available to appear in test items as novel bigrams which did not appear in any training items. By default, M is 2.
The example below builds on the AGLSS object S_XXX created for a previous example. Here, chunk novelty is combined factorially with grammaticality to generate four sets of test items. The chunk novelty factor reserves a single grammatical bigram which is excluded from training items. This is the only novel bigram available to grammatical test items.
    [s, levnames, tgts] = chnov_factor(s_xxx, 'ChNov', [], [], 1);
    levnames
    tgts
    [s, glevs] = gram_factor(s, 'gram');
    s = factorial_testsets(s, {'ChNov', levnames{:}}, {'gram', glevs{:}});
    s = choose_items(s, 6, 2);
    disp('Training items:'); disp(format_train_items(s));
    disp('Test items:'); disp(format_test_items(s));
    1 of 8 chunks of length 3 appear in no grammatical items
    levnames =
        'LowChNov'    'HighChNov'
    tgts =
         0     1
    Choosing training item 1....
    Choosing training item 2....
    Choosing training item 3....
    Choosing training item 4....
    Updating potential items....
    Choosing test item 1 for each set... 3 2 4 1.
    Choosing training item 5....
    Updating potential items....
    Choosing test item 2 for each set... 2 3 4 1.
    Choosing training item 6....
    Training items:
    Itm_num  Itm_name
    01       xyxy
    02       yxy
    03       xxy
    04       yxyxxxy
    05       xyxyxxxy
    06       yxyxxxxy
    Test items:
    Tset_num  ChNov_cat  gram_cat  Itm_num  Itm_name    ChNov_1  ChNov_2  ChNov_3  ChNov_tgt_1  ChNov_tgt_2  ChNov_tgt_3
    01        LowChNov   G         01       xyxyxxxxx   0.000    0.000    0.000    NaN          0.000        NaN
    01        LowChNov   G         02       yxyxxxx     0.000    0.000    0.000    NaN          0.000        NaN
    02        HighChNov  G         01       yxyxxyxyyx  0.000    1.000    2.000    NaN          1.000        NaN
    02        HighChNov  G         02       yxyyxyxxxy  0.000    1.000    2.000    NaN          1.000        NaN
    03        LowChNov   NG        01       xxxyxxy     0.000    0.000    0.000    NaN          0.000        NaN
    03        LowChNov   NG        02       xyxxxyxxx   0.000    0.000    0.000    NaN          0.000        NaN
    04        HighChNov  NG        01       yxxyyxyx    0.000    1.000    2.000    NaN          1.000        NaN
    04        HighChNov  NG        02       xxxxyxxyy   0.000    1.000    1.000    NaN          1.000        NaN
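The effect of M can be checked by listing which bigrams the training items leave unused. The sketch below is plain MATLAB with no toolbox calls. It compares the training items from the first example (default M of 2) with those displayed above (M of 1). The helper names bigrams_of, flatten, and used_bigrams are hypothetical.

```matlab
% Enumerate every possible bigram over the grammar's two symbols.
symbols = 'xy';
all_bigrams = {};
for a = symbols
    for b = symbols
        all_bigrams{end+1} = [a b]; %#ok<AGROW>
    end
end

% Hypothetical helpers: the bigrams of one string, and the distinct
% bigrams used anywhere in a cell array of strings.
bigrams_of   = @(s) arrayfun(@(k) s(k:k+1), 1:(length(s)-1), ...
                             'UniformOutput', false);
flatten      = @(c) [c{:}];
used_bigrams = @(items) unique(flatten(cellfun(bigrams_of, items, ...
                                               'UniformOutput', false)));

% Training items copied from the displays (default M = 2, then M = 1).
train_default = {'xyxy', 'xyxyxyxyxy', 'xyx', 'yxy', 'xyxyxyxy', 'xyxyxy'};
train_m1      = {'xyxy', 'yxy', 'xxy', 'yxyxxxy', 'xyxyxxxy', 'yxyxxxxy'};

unused_default = setdiff(all_bigrams, used_bigrams(train_default));
unused_m1      = setdiff(all_bigrams, used_bigrams(train_m1));

disp(unused_default)   % 'xx' and 'yy' are left for test items
disp(unused_m1)        % only 'yy' is left for test items
```

With M set to 1, the bigram 'xx' is free to appear in training items, so 'yy' is the only bigram that can make a test item novel.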
Reserving novel chunks regardless of grammaticality
CHNOV_FACTOR(aglss, 'fname', T, NAMES, [N M]) specifies the minimum number of chunks (N) to exclude from training items whether those chunks are grammatical or not, in addition to the minimum number of grammatical chunks (M) to exclude. In particular, M can be set to 0 if grammaticality is not going to be used as a stimulus factor.
The example below builds on the AGLSS object S_XXX created for a previous example. Here, a single bigram is excluded from training items, and it need not be a grammatical bigram. However, the XXX_GRAMMAR allows all four of the possible bigrams, so in this case training items will have to avoid a grammatical bigram to leave a novel bigram available to items in the test set.
    [s, levnames, tgts] = chnov_factor(s_xxx, 'ChNov', [], [], [1 0]);
    levnames
    tgts
    s = factorial_testsets(s, {'ChNov', levnames{:}});
    s = choose_items(s, 6, 2);
    disp('Training items:'); disp(format_train_items(s));
    disp('Test items:'); disp(format_test_items(s));
    1 of 8 chunks of length 3 appear in no grammatical items
    levnames =
        'LowChNov'    'HighChNov'
    tgts =
         0     1
    Choosing training item 1....
    Choosing training item 2....
    Choosing training item 3....
    Choosing training item 4....
    Updating potential items....
    Choosing test item 1 for each set... 2 1.
    Choosing training item 5....
    Updating potential items....
    Choosing test item 2 for each set... 2 1.
    Choosing training item 6....
    Training items:
    Itm_num  Itm_name
    01       xyxy
    02       yxy
    03       xxy
    04       yxyxxxy
    05       xyxyxxxy
    06       yxyxxxxy
    Test items:
    Tset_num  ChNov_cat  Itm_num  Itm_name   ChNov_1  ChNov_2  ChNov_3  ChNov_tgt_1  ChNov_tgt_2  ChNov_tgt_3
    01        LowChNov   01       xyxyxxxxx  0.000    0.000    0.000    NaN          0.000        NaN
    01        LowChNov   02       xxxyxxy    0.000    0.000    0.000    NaN          0.000        NaN
    02        HighChNov  01       yxxyyxyx   0.000    1.000    2.000    NaN          1.000        NaN
    02        HighChNov  02       xxxxyxxyy  0.000    1.000    1.000    NaN          1.000        NaN
Chunk size used to compute chunk novelty
CHNOV_FACTOR(aglss, 'fname', T, NAMES, NM, CHSIZE) specifies the length of chunks on which to compute chunk novelty.
The example below builds on the AGLSS object S_XXX created for a previous example. Here, chunk novelty is defined on trigrams rather than bigrams. A minimum of two trigrams is excluded from training items, with no minimum on how many of them must be grammatical. The XXX_GRAMMAR leaves one ungrammatical trigram. That one plus one grammatical trigram will be excluded from training items to leave two novel trigrams available to items in the test set.
    [s, levnames, tgts] = chnov_factor(s_xxx, 'TrigramNov', [], [], [2 0], 3);
    levnames
    tgts
    s = factorial_testsets(s, {'TrigramNov', levnames{:}});
    s = choose_items(s, 6, 2);
    disp('Training items:'); disp(format_train_items(s));
    disp('Test items:'); disp(format_test_items(s));
    1 of 8 chunks of length 3 appear in no grammatical items
    levnames =
        'LowTrigramNov'    'HighTrigramNov'
    tgts =
         0     1
    Choosing training item 1....
    Choosing training item 2....
    Choosing training item 3....
    Choosing training item 4....
    Updating potential items....
    Choosing test item 1 for each set... 2 1.
    Choosing training item 5....
    Updating potential items....
    Choosing test item 2 for each set... 1 2.
    Choosing training item 6....
    Training items:
    Itm_num  Itm_name
    01       yxyxxxyxyy
    02       yxyy
    03       xxxyxyy
    04       yxyxxxy
    05       yxyxxxxy
    06       xxyxyy
    Test items:
    Tset_num  TrigramNov_cat  Itm_num  Itm_name   TrigramNov_1  TrigramNov_2  TrigramNov_3  TrigramNov_tgt_1  TrigramNov_tgt_2  TrigramNov_tgt_3
    01        LowTrigramNov   01       xxxxyxxyy  0.000         0.000         0.000         NaN               NaN               0.000
    01        LowTrigramNov   02       xyxyxxxxx  0.000         0.000         0.000         NaN               NaN               0.000
    02        HighTrigramNov  01       yxxyyxyx   0.000         0.000         1.000         NaN               NaN               1.000
    02        HighTrigramNov  02       yxxxxxyyy  0.000         0.000         1.000         NaN               NaN               1.000
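The arithmetic behind the [N M] argument in the last two examples can be summarized in one line. This is only one reading of the descriptions above, offered as an assumption rather than as the toolbox's documented rule: chunks that appear in no grammatical item are taken to be absent from training already, so they count toward N, and grammatical chunks are withheld only as needed to satisfy both minimums. The function name grammatical_to_reserve is hypothetical.

```matlab
% Hypothetical helper (an assumed reading, not toolbox code): number of
% grammatical chunks to withhold from training items, given
%   N  minimum number of novel chunks overall
%   M  minimum number of novel grammatical chunks
%   U  number of chunks that appear in no grammatical item
grammatical_to_reserve = @(N, M, U) max(M, N - U);

% Bigram example: [N M] = [1 0], and the grammar allows every bigram (U = 0).
bigram_case = grammatical_to_reserve(1, 0, 0);
disp(bigram_case)    % 1

% Trigram example: [N M] = [2 0], and 1 of 8 trigrams appears in no
% grammatical item (U = 1).
trigram_case = grammatical_to_reserve(2, 0, 1);
disp(trigram_case)   % 1
```

In both examples exactly one grammatical chunk is withheld, which agrees with the descriptions of the bigram and trigram cases.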