@GregBernhardt4 * Throw in a few adjectives.
* Move adjectives from before the noun to after it (or vice versa)
* Add in some adverbs
* Remove some fluff terms
* Alter sentence structure (main + subordinate clause becomes 2 main clauses, or 2 main clauses become 1 main + 1 subordinate)
* Remove/replace repeated phrasing
@GregBernhardt4 The generators build text using common patterns.
The detectors spot those patterns.
Alter the content, remove the patterns; job done.
@GregBernhardt4 Personally, I wouldn't rely on things like:
* punctuation
* spacing
* capitalisation
etc.
One of the first things typically done in NLP-based tasks is text normalisation.
This can reduce the text to single-case, single-spaced, with no punctuation/non-word/digit characters etc.
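A minimal sketch of why those surface tweaks tend not to survive, assuming a detector normalises its input first. The `normalise` function here is hypothetical and deliberately crude (ASCII letters only); real pipelines vary.

```python
import re

def normalise(text: str) -> str:
    """Crude normalisation: single case, letters only, single spaces (assumed pipeline)."""
    text = text.lower()                     # single-case
    text = re.sub(r"[^a-z\s]", " ", text)   # drop punctuation, digits, other non-letters
    text = re.sub(r"\s+", " ", text)        # collapse runs of whitespace
    return text.strip()

a = "Jack and Jill went up the hill, to fetch a pail of water."
b = "jack  and jill went up the hill to fetch a PAIL of water!!"

# Both variants collapse to the same string once normalised.
print(normalise(a) == normalise(b))
```

If the two strings come out identical after this step, any changes to punctuation, spacing, or capitalisation made no difference to what the detector actually sees.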
@GregBernhardt4 Far better to take the same approaches used to avoid paraphrase detection,
which (as well as my earlier suggestions) include:
* remove sentences
* move sentences
* append/inject sentences
* swap prepositions (word <> phrase)
* shift language level (simple <> complex)
@GregBernhardt4 If you ensure that the total
character/word count,
number of sentences/paragraphs,
number of instances of primary words/phrases,
are significantly (15%?) different,
and
the order of content (Sentence/Paragraph/Section) differs
and
there is a % of the original missing
and
there is a % newly added
@GregBernhardt4 ... you're usually pretty safe.
Jack and Jill went up the hill
to fetch a pail of water.
Jill went to get a bucket of water,
and Jack accompanied her up the hill.
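The checks from the thread can be roughly sketched as below, using the two Jack and Jill versions as input. This is an illustration under assumptions, not a real detector: `compare` and its helpers are hypothetical names, sentence splitting is naive, "missing/added" is measured on unique words only, word order is approximated with `difflib.SequenceMatcher`, and counting instances of primary words/phrases is left out.

```python
import re
from difflib import SequenceMatcher

def words(text):
    # Lower-cased word tokens (letters and apostrophes only).
    return re.findall(r"[a-z']+", text.lower())

def sentences(text):
    # Naive split on whitespace that follows ., ! or ?
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def relative_change(before, after):
    # Absolute change as a fraction of the original value.
    return abs(after - before) / before if before else 0.0

def compare(original, rewrite):
    ow, rw = words(original), words(rewrite)
    o_set, r_set = set(ow), set(rw)
    return {
        "char_change": relative_change(len(original), len(rewrite)),
        "word_change": relative_change(len(ow), len(rw)),
        "sentence_change": relative_change(len(sentences(original)),
                                           len(sentences(rewrite))),
        # Share of the original vocabulary that no longer appears.
        "vocab_missing": len(o_set - r_set) / len(o_set) if o_set else 0.0,
        # Share of the rewrite's vocabulary that is new.
        "vocab_added": len(r_set - o_set) / len(r_set) if r_set else 0.0,
        # 1.0 means the same words in the same order; lower means more reordering/change.
        "order_similarity": SequenceMatcher(None, ow, rw).ratio(),
    }

original = "Jack and Jill went up the hill to fetch a pail of water."
rewrite = "Jill went to get a bucket of water, and Jack accompanied her up the hill."

for name, value in compare(original, rewrite).items():
    print(f"{name}: {value:.2f}")
```

For this pair the word count goes from 13 to 15, which is right around the 15% mark suggested above, while the sentence count is unchanged (one sentence each).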