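【Question title】: Randomly Sample a Fixed Length Substring from a Larger String (R)
【Posted】: 2015-10-11 21:07:51
【Question】:

I have a long string of roughly 1000 characters (call it SuperString), and I want to randomly sample 100 substrings from it.

Each substring should be 10 characters long, and its characters should appear in the same order as they do in SuperString, i.e. each substring is a contiguous slice of SuperString.

Example: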

SuperString = "ADKFKDSLFSDHKENNCNEUNCIEOCIKEMNKSDFU...KJSDLJDFSKLDJSLJ"
substrings = ["FSDHKENNCN", "ADKFKDSLFS", ... ,"OCIKEMNKSD"]

【Comments】:

  • Perhaps use sample() to pick the starting positions and pass them to substr().
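
A minimal sketch of the idea in this comment, assuming all start positions are drawn in one call. The helper name `sampleSubstrings` and its parameters `n` and `len` are hypothetical and not part of the original post. It uses `substring()` rather than `substr()`, because `substring()` recycles over a vector of start positions while `substr()` returns one result per input string.

```r
# Hypothetical helper: draw n substrings of length len from one string.
sampleSubstrings = function(string, n = 100, len = 10) {
    # Valid start positions are 1 .. nchar(string) - len + 1
    starts = sample.int(nchar(string) - len + 1, n, replace = TRUE)
    # substring() is vectorized over the start and end positions
    substring(string, starts, starts + len - 1)
}

set.seed(1)
s = paste(sample(LETTERS, 1000, replace = TRUE), collapse = "")
subs = sampleSubstrings(s)
```

With the defaults, `subs` is a character vector of 100 elements, each 10 characters long.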

Tags: r string random-sample


【Solution 1】:
# Create a SuperString
set.seed(87)
SuperString = paste(sample(LETTERS, 1000, replace=TRUE), collapse="")

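# Function to sample len characters in a row, starting at a random point
# in the string. The last valid start is nchar(string) - len + 1, so the
# substring never runs past the end of the string.
sampleString = function(string, len = 10) {
    nStart = sample.int(nchar(string) - len + 1, 1)
    substr(string, nStart, nStart + len - 1)
}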

# Run the function 100 times
substrings = replicate(100, sampleString(SuperString))

substrings
[1] "VEOUELBFTD" "OPTCIDDNXK" "SFHNKKGOWR" "RVJQYYUSAZ" "MQMBMKCTTI" "ZKLWETGMVR"
[7] "OOXFLGCGPX" "DXAVUMQMBM" "HOORFCFABC" "AMOYPOXXRA" "TGKWKKZUEK" "UYPRPYQCMU" 
...
[91] "RZNSLOBFBK" "FKUKMDUQIK" "YGXDXAVUMQ" "SIRAMRBXSH" "TAILZPHZYS" "OEOUTGKWKK"
[97] "XFLGCGPXKZ" "EDRVJQYYUS" "RHUZLBFNQX" "MUWUODCCFT"
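
Because replicate() draws each start position independently, the same position can be picked more than once. If distinct start positions are wanted, one option is to draw them all at once without replacement. This is only a sketch: `sampleDistinct` and its parameters are hypothetical names, not part of the original answer, and distinct start positions do not by themselves guarantee distinct substrings if the text repeats.

```r
# Hypothetical variant: draw n distinct start positions in a single call.
sampleDistinct = function(string, n = 100, len = 10) {
    nStarts = nchar(string) - len + 1    # number of valid start positions
    stopifnot(n <= nStarts)              # cannot draw more distinct starts than exist
    starts = sample.int(nStarts, n)      # replace = FALSE by default
    substring(string, starts, starts + len - 1)
}

set.seed(87)
SuperString = paste(sample(LETTERS, 1000, replace=TRUE), collapse="")
distinctSubstrings = sampleDistinct(SuperString)
```

Substrings drawn this way can still overlap one another; only the exact start positions are guaranteed to differ.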

【Comments】:
