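【Posted】: 2011-09-03 18:51:46
【Question】:
Given a text, how can I compute the word-length density/count so that I get output like this:
- 1-letter words: 52 / 1%
- 2-letter words: 34 / 0.5%
- 3-letter words: 67 / 2%
I found this, but it is for Python.
【Question discussion】: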
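You can start by splitting the text into words, using explode() (a very simple, perhaps too simple, solution) or preg_split() (which allows more powerful splitting rules):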
$text = "this is some kind of text with several words";
$words = explode(' ', $text);
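For the preg_split() route mentioned above, a minimal sketch could look like the following. The sample text and the pattern are assumptions chosen for illustration: the pattern treats any run of characters that are not letters or digits as a separator, which may or may not match what you consider a word boundary.

```php
<?php
// Hypothetical sample text containing punctuation and repeated separators.
$text = "Hello, world! This is some text... with punctuation.";

// Split on runs of characters that are neither letters nor digits.
// PREG_SPLIT_NO_EMPTY drops the empty strings that would otherwise
// appear at the edges or between adjacent separators.
$words = preg_split('/[^\p{L}\p{N}]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);

print_r($words);
```

Unlike explode(' ', $text), this does not leave commas or periods attached to the words, so the lengths measured afterwards are not inflated by punctuation.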
Then loop over the words, use strlen() to get the length of each one, and tally those lengths in an array:
$results = array();
foreach ($words as $word) {
    $length = strlen($word);
    if (isset($results[$length])) {
        $results[$length]++;
    }
    else {
        $results[$length] = 1;
    }
}
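If you are working with UTF-8, see mb_strlen(), since strlen() counts bytes rather than characters.
At the end of that loop, $results will look like this: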
array
4 => int 5
2 => int 2
7 => int 1
5 => int 1
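The total number of words, which you need in order to compute the percentages, can be obtained in either of two ways:
with a counter incremented inside the foreach loop, or by calling array_sum() on $results. As for computing the percentages themselves, that is a bit of math, and I won't be much help with that ^^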
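The percentage step left open above can be sketched as follows. This is only one possible approach: it starts from the tally shown for the example sentence, takes the total with array_sum(), and formats each share to one decimal place. The output wording and the $lines array are assumptions modeled on the format requested in the question.

```php
<?php
// Tally from the example sentence: word length => number of words.
$results = array(4 => 5, 2 => 2, 7 => 1, 5 => 1);

$total = array_sum($results); // total number of words (9 here)
ksort($results);              // list shorter words first

$lines = array();
foreach ($results as $length => $count) {
    // Share of all words that have this length, as a percentage.
    $percent = $count * 100 / $total;
    $lines[] = sprintf("%d-letter words: %d / %.1f%%", $length, $count, $percent);
}

echo implode("\n", $lines), "\n";
```

For the example tally this prints four lines, starting with `2-letter words: 2 / 22.2%` and `4-letter words: 5 / 55.6%`.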
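【Discussion】:
You can split the text on spaces and then count the number of letters in each resulting word. If the text contains punctuation or any other word separators, you have to take those into account.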
$lettercount = array();
$text = "lorem ipsum dolor sit amet";
foreach (explode(' ', $text) as $word)
{
    @$lettercount[strlen($word)]++; // @ for avoiding E_NOTICE on first addition
}
foreach ($lettercount as $numletters => $numwords)
{
    echo "$numletters letters: $numwords<br />\n";
}
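PS: I haven't tested this, but it should work.
【Discussion】:
You can be smarter about this and use preg_replace to remove the punctuation instead of the str_replace calls below.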
$txt = "Sean Hoare, who was first named News of the World journalist to make hacking allegations, found dead at Watford home. His death is not being treated as suspiciou";
$txt = str_replace( "  ", " ", $txt ); // collapse double spaces into single spaces
$txt = str_replace( ".", "", $txt );
$txt = str_replace( ",", "", $txt );
$a = explode( " ", $txt );
$cnt = array();
foreach ( $a as $b )
{
    if ( isset( $cnt[strlen($b)] ) )
        $cnt[strlen($b)] += 1;
    else
        $cnt[strlen($b)] = 1;
}
foreach ( $cnt as $k => $v )
{
    echo $k . " letter words: " . $v . " " . round( ( $v * 100 ) / count( $a ) ) . "%\n";
}
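The preg_replace variant suggested at the top of this answer might look roughly like this. It is a sketch under assumptions: the sample sentence is made up, and the pattern simply deletes every character that is not a letter, digit, or whitespace, which also strips apostrophes and hyphens.

```php
<?php
// Hypothetical sample text.
$txt = "Found dead at home. His death, police said, is not suspicious.";

// Remove every character that is not a letter, a digit, or whitespace.
$clean = preg_replace('/[^\p{L}\p{N}\s]+/u', '', $txt);

// Split on runs of whitespace so repeated spaces do not produce empty words.
$words = preg_split('/\s+/', $clean, -1, PREG_SPLIT_NO_EMPTY);

$cnt = array();
foreach ($words as $word) {
    // strlen() counts bytes; for multibyte UTF-8 text mb_strlen() would be needed.
    $len = strlen($word);
    $cnt[$len] = isset($cnt[$len]) ? $cnt[$len] + 1 : 1;
}
ksort($cnt);

foreach ($cnt as $len => $n) {
    echo $len . " letter words: " . $n . " " . round($n * 100 / count($words)) . "%\n";
}
```

Because the words are split with PREG_SPLIT_NO_EMPTY, count($words) only includes real words, so the percentages are not skewed by empty strings from leading or doubled spaces.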
【Discussion】:
My simple way to check, in PHP, whether any word in a string is longer than a given number of characters.
function checkWord_len($string, $nr_limit) {
    $d = null; // Stays null when no word exceeds the limit
    $text_words = explode(" ", $string); // Get the array of words from the text
    $text_count = count($text_words);
    for ($i = 0; $i < $text_count; $i++) {
        $cc = strlen($text_words[$i]); // Get the character length of each word
        if ($cc > $nr_limit) { // Check the limit
            $d = "0";
        }
    }
    return $d; // "0" if some word is too long, otherwise null
}
$string_to_check = " here is your text to check"; // Text to check
$nr_string_limit = 5; // Maximum allowed word length
$rez_fin = checkWord_len($string_to_check, $nr_string_limit);
if ($rez_fin === '0') {
    echo "false";
    // Execute the false code
}
elseif ($rez_fin === null) {
    echo "true";
    // Execute the true code
}
【Discussion】: