PHP
Last updated: Fri, 05 Sep 2008

php_strip_whitespace

(PHP 5)

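php_strip_whitespace - Return source with comments and whitespace stripped

Description

string php_strip_whitespace ( string $filename )

Returns the PHP source code in filename with comments and whitespace removed. This can be useful for finding out how much actual code a script contains. It is the same as running php -w from the command line.

Parameters

filename

Path to the PHP file.

Return Values

The processed source code on success, or an empty string on failure.

Note: This function works as described as of PHP 5.0.1. Before that it simply returned an empty string. For details about this bug, see bug » #29606.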

Example #1 php_strip_whitespace() example

<?php
// This is a PHP comment

/*
 * This is also a PHP comment
 */

echo        php_strip_whitespace(__FILE__);
// Newlines are treated as whitespace and are removed as well
do_nothing();
?>

The above example will output:

<?php
 echo php_strip_whitespace(__FILE__); do_nothing(); ?>

Notice that the PHP comments are gone, as are the extra whitespace and the newline after the first echo statement.
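
The description mentions using this function to gauge how much of a script is actual code. A minimal sketch of that idea follows; code_ratio() is a hypothetical helper introduced here for illustration, not part of PHP, and the temporary file exists only for the demo.

```php
<?php
// Hypothetical helper (not part of PHP): returns the fraction of a file's
// bytes that remain after comments and whitespace are stripped.
function code_ratio($filename) {
    $raw = file_get_contents($filename);
    $stripped = php_strip_whitespace($filename);
    if ($raw === false || $raw === '' || $stripped === '') {
        return false; // unreadable or empty file
    }
    return strlen($stripped) / strlen($raw);
}

// Demo on a throwaway file containing two comments and one statement.
$tmp = tempnam(sys_get_temp_dir(), 'psw');
file_put_contents($tmp, "<?php\n// a comment\n\n/* another */\necho   'hi';\n");
$ratio = code_ratio($tmp);
unlink($tmp);

printf("%.0f%% of the file is actual code\n", $ratio * 100);
```

The ratio counts bytes, so it is only a rough indicator; long string literals count as code just like statements do.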



User Contributed Notes
gelamu at gmail dot com
10-Apr-2008 11:12
With this function you can compress your PHP source code.

<?php

function compress_php_src($src) {
    // Whitespace to the left and right of these tokens can be ignored
    static $IW = array(
        T_CONCAT_EQUAL,             // .=
        T_DOUBLE_ARROW,             // =>
        T_BOOLEAN_AND,              // &&
        T_BOOLEAN_OR,               // ||
        T_IS_EQUAL,                 // ==
        T_IS_NOT_EQUAL,             // != or <>
        T_IS_SMALLER_OR_EQUAL,      // <=
        T_IS_GREATER_OR_EQUAL,      // >=
        T_INC,                      // ++
        T_DEC,                      // --
        T_PLUS_EQUAL,               // +=
        T_MINUS_EQUAL,              // -=
        T_MUL_EQUAL,                // *=
        T_DIV_EQUAL,                // /=
        T_IS_IDENTICAL,             // ===
        T_IS_NOT_IDENTICAL,         // !==
        T_DOUBLE_COLON,             // ::
        T_PAAMAYIM_NEKUDOTAYIM,     // ::
        T_OBJECT_OPERATOR,          // ->
        T_DOLLAR_OPEN_CURLY_BRACES, // ${
        T_AND_EQUAL,                // &=
        T_MOD_EQUAL,                // %=
        T_XOR_EQUAL,                // ^=
        T_OR_EQUAL,                 // |=
        T_SL,                       // <<
        T_SR,                       // >>
        T_SL_EQUAL,                 // <<=
        T_SR_EQUAL,                 // >>=
    );
    // Accept either a file name or the source code itself
    if(is_file($src)) {
        if(!$src = file_get_contents($src)) {
            return false;
        }
    }
    $tokens = token_get_all($src);

    $new = "";
    $c = sizeof($tokens);
    $iw = false; // ignore whitespace
    $ih = false; // in HEREDOC
    $ls = "";    // last sign
    $ot = null;  // open tag
    for($i = 0; $i < $c; $i++) {
        $token = $tokens[$i];
        if(is_array($token)) {
            list($tn, $ts) = $token; // tokens: number, string, line
            $tname = token_name($tn);
            if($tn == T_INLINE_HTML) {
                $new .= $ts;
                $iw = false;
            } else {
                if($tn == T_OPEN_TAG) {
                    if(strpos($ts, " ") || strpos($ts, "\n") || strpos($ts, "\t") || strpos($ts, "\r")) {
                        $ts = rtrim($ts);
                    }
                    $ts .= " ";
                    $new .= $ts;
                    $ot = T_OPEN_TAG;
                    $iw = true;
                } elseif($tn == T_OPEN_TAG_WITH_ECHO) {
                    $new .= $ts;
                    $ot = T_OPEN_TAG_WITH_ECHO;
                    $iw = true;
                } elseif($tn == T_CLOSE_TAG) {
                    if($ot == T_OPEN_TAG_WITH_ECHO) {
                        $new = rtrim($new, "; ");
                    } else {
                        $ts = " ".$ts;
                    }
                    $new .= $ts;
                    $ot = null;
                    $iw = false;
                } elseif(in_array($tn, $IW)) {
                    $new .= $ts;
                    $iw = true;
                } elseif($tn == T_CONSTANT_ENCAPSED_STRING
                      || $tn == T_ENCAPSED_AND_WHITESPACE)
                {
                    if($ts[0] == '"') {
                        $ts = addcslashes($ts, "\n\t\r");
                    }
                    $new .= $ts;
                    $iw = true;
                } elseif($tn == T_WHITESPACE) {
                    $nt = @$tokens[$i+1];
                    if(!$iw && (!is_string($nt) || $nt == '$') && !in_array($nt[0], $IW)) {
                        $new .= " ";
                    }
                    $iw = false;
                } elseif($tn == T_START_HEREDOC) {
                    $new .= "<<<S\n";
                    $iw = false;
                    $ih = true; // in HEREDOC
                } elseif($tn == T_END_HEREDOC) {
                    $new .= "S;";
                    $iw = true;
                    $ih = false; // in HEREDOC
                    // Skip ahead to the statement's own semicolon, if any
                    for($j = $i+1; $j < $c; $j++) {
                        if(is_string($tokens[$j]) && $tokens[$j] == ";") {
                            $i = $j;
                            break;
                        } else if($tokens[$j][0] == T_CLOSE_TAG) {
                            break;
                        }
                    }
                } elseif($tn == T_COMMENT || $tn == T_DOC_COMMENT) {
                    $iw = true;
                } else {
                    if(!$ih) {
                        $ts = strtolower($ts);
                    }
                    $new .= $ts;
                    $iw = false;
                }
            }
            $ls = "";
        } else {
            // Single-character token: drop repeated ";" and ":"
            if(($token != ";" && $token != ":") || $ls != $token) {
                $new .= $token;
                $ls = $token;
            }
            $iw = true;
        }
    }
    return $new;
}

?>

For example:
<?php

$src = <<<EOT
<?php
// some comment
for ( $i = 0; $i < 99; $i ++ )
{
   echo "i=${ i }\n";
   /* ... */
}
/** ... */
function abc()
{
   return   "abc";
};

abc();
?>
<h1><?= "Some text " . str_repeat("_-x-_ ", 32);;; ?></h1>
EOT;
var_dump(compress_php_src($src));
?>

And the result is:
string(125) "<?php for(=0;<99;++){echo "i=\n";}function abc(){return "abc";};abc(); ?>
<h1><?="Some text ".str_repeat("_-x-_ ",32)?></h1>"

Note that $i is missing from the result because the heredoc interpolates the (undefined) variable before the source ever reaches compress_php_src(). Escape the dollar signs (\$i) in the heredoc to keep the variables.
Jouni
03-Oct-2007 02:14
If you wish to just remove excess whitespace from a string, see the example "Strip whitespace" in the preg_replace documentation (http://www.php.net/manual/en/function.preg-replace.php).
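
A minimal sketch along the lines of that preg_replace example, collapsing every run of two or more whitespace characters in a string into a single space. The sample string is made up for illustration.

```php
<?php
// Sample input with runs of spaces, tabs and a newline.
$str = "foo   o\t\tbar \n  baz";

// \s\s+ matches two or more consecutive whitespace characters,
// so a lone space (or a lone tab) is left untouched.
$str = preg_replace('/\s\s+/', ' ', $str);

echo $str, "\n"; // foo o bar baz
```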
Zvjezdan Patz
14-Sep-2007 05:14
I was given a report that was separated by spaces and asked to make graphs from it.  I needed to turn the report data into a csv in memory so I could manipulate it further. 

First I needed to read the report, then strip out the whitespace while leaving one space between items, so that each item could be converted to a column.

There were lots of complicated ways to do this.  I stumbled on something simple.

Say the report looks like this:

Monday    Tuesday    Wednesday    Thursday    Friday    Saturday    Sunday
1         5          7            8           10        7           8
7         15         4            0           21        4           12
9         5          7            9           0         9           43

The report uses spaces, not tabs, to separate everything.  Assuming it is a file called data.txt, you can use the following to strip out the spaces and make it comma delimited:

<?php

$handle = @fopen("data.txt", "r");

if ($handle)
{
  while (!feof($handle))
  {
    $buffer = fgets($handle, 4096);

    // this will search for 5 spaces and replace with 1, then 4, then 3, then 2
    // then only one will be left.  Replace that one space with a comma
    // then output with nl2br so you can see the line breaks
    print nl2br(str_replace(" ", ",", str_replace('  ', ' ', str_replace('   ', ' ', str_replace('    ', ' ', str_replace('     ', ' ', $buffer))))));
  }

  // close the file only if it was actually opened
  fclose($handle);
}
?>

Hope that helps someone else.
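
As an alternative sketch of the same idea, a single regular expression can split on whitespace runs of any length, which avoids chaining one replacement per run length. spaces_to_csv() is a hypothetical helper name and the two-line report is made-up sample data; it assumes no field contains a space or a comma.

```php
<?php
// Hypothetical helper: turns one whitespace-separated line into a
// comma-separated line. Leading and trailing whitespace is ignored.
function spaces_to_csv($line) {
    $fields = preg_split('/\s+/', trim($line), -1, PREG_SPLIT_NO_EMPTY);
    return implode(',', $fields);
}

// Made-up sample in the same shape as the report above.
$report = "Monday    Tuesday    Wednesday\n1               5               7      \n";

foreach (explode("\n", trim($report)) as $line) {
    echo spaces_to_csv($line), "\n";
}
// Monday,Tuesday,Wednesday
// 1,5,7
```

If fields may themselves contain commas or quotes, writing the split fields with fputcsv() is the safer choice.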
natio at phpfox dot com
06-Sep-2007 08:21
Notice: In my last comment for this function I failed to add some important parts of the function. So I have re-added it here. Feel free to delete my earlier comment. Thanks!
---

To use php_strip_whitespace in PHP 4 (>= 4.2.0), you could try the function below. It also works around php_strip_whitespace not fully removing newlines and extra whitespace in HTML embedded with PHP.

<?php

// T_ML_COMMENT exists only in PHP 4, T_DOC_COMMENT only in PHP 5
if (!defined ('T_ML_COMMENT'))
{
    define ('T_ML_COMMENT', T_COMMENT);
}
if (!defined ('T_DOC_COMMENT'))
{
    define ('T_DOC_COMMENT', T_ML_COMMENT);
}

function StripWhitespace($sFileName)
{
    if ( !is_file($sFileName) )
    {
        return false;
    }

    $sContent = implode('', file($sFileName));

    $aTokens = token_get_all($sContent);

    $bLast = false; // true if the previous token was whitespace
    $sStr = '';
    for ( $i = 0, $j = count($aTokens); $i < $j; $i++ )
    {
        if ( is_string($aTokens[$i]) )
        {
            $bLast = false;
            $sStr .= $aTokens[$i];
        }
        else
        {
            switch ( $aTokens[$i][0] )
            {
                case T_COMMENT:
                case T_ML_COMMENT:
                case T_DOC_COMMENT:
                    // drop comments
                    break;
                case T_WHITESPACE:
                    // collapse any run of whitespace into one space
                    if (!$bLast)
                    {
                        $sStr .= ' ';
                        $bLast = true;
                    }
                    break;
                default:
                    $bLast = false;
                    $sStr .= $aTokens[$i][1];
                    break;
            }
        }
    }

    $sStr = trim($sStr);
    $sStr = str_replace("\n", "", $sStr);
    $sStr = str_replace("\r", "", $sStr);

    return $sStr;
}
?>
nyctimus at yahoo dot com
03-May-2007 11:01
Here's one for CSS:

<?php

function css_strip_whitespace($css)
{
  $replace = array(
    "#/\*.*?\*/#s" => "",  // Strip C style comments.
    "#\s\s+#"      => " ", // Strip excess whitespace.
  );
  $search = array_keys($replace);
  $css = preg_replace($search, $replace, $css);

  $replace = array(
    ": "  => ":",
    "; "  => ";",
    " {"  => "{",
    " }"  => "}",
    ", "  => ",",
    "{ "  => "{",
    ";}"  => "}", // Strip optional semicolons.
    ",\n" => ",", // Don't wrap multiple selectors.
    "\n}" => "}", // Don't wrap closing braces.
    "} "  => "}\n", // Put each rule on its own line.
  );
  $search = array_keys($replace);
  $css = str_replace($search, $replace, $css);

  return trim($css);
}

?>

A word on the first regular expression, since it took me a while.
It strips C style comments. /* Like this. */

#/\*.*?\*/#s
^         ^
The pound signs at either end quote the regex. They don't match anything.

#/\*.*?\*/#s
           ^
The s at the very end sets the PCRE_DOTALL modifier. More info here:
http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php

#  /\*  .*?  \*/  #s
    1    2    3
The expression itself consists of 3 parts:
1. the opening comment sequence, represented by     /\*
2. everything in the middle, represented by         .*?
3. and the closing comment sequence, represented by \*/

#/\*.*?\*/#s
   ^    ^
The comment asterisks are escaped. If I had used the more common / for PCRE quoting I would've had to escape those too.

#/\*.*?\*/#s
      ^
The ? prevents the regex from being greedy. See halfway down this page:
http://www.php.net/manual/en/reference.pcre.pattern.syntax.php
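
To see why the ? matters, here is a small sketch comparing the lazy pattern from above with a greedy variant. The CSS string is made up for illustration.

```php
<?php
// Two comments with a rule between them.
$css = "a{color:red}/* one */b{color:blue}/* two */c{}";

// Lazy: each comment is matched separately, so the rule in between survives.
$lazy = preg_replace('#/\*.*?\*/#s', '', $css);

// Greedy: one match runs from the first /* to the last */,
// taking the rule between the comments with it.
$greedy = preg_replace('#/\*.*\*/#s', '', $css);

echo $lazy, "\n";   // a{color:red}b{color:blue}c{}
echo $greedy, "\n"; // a{color:red}c{}
```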
flconseil at yahoo dot fr
08-Jul-2006 05:57
Beware that this function uses the output buffering mechanism.

If you give a 'stream wrapped' path as argument, anything echoed by the stream wrapper during this call (e.g. trace messages) won't be displayed to the screen but will be inserted in php_strip_whitespace's result.

If you later execute this stripped code, it will display the messages that should have been output during the php_strip_whitespace call!
mwwaygoo AT hotmail DOT com
27-Apr-2006 06:17
I thought this was a nice function until I realised it wouldn't strip down HTML, which is what I wanted after reading an article on compressing output to speed up delivery.
So I wrote a little one to do that for me. Here it is, in case people are looking for an HTML version. It may need tweaking, for example to cope with existing &nbsp;'s.

<?php
function strip_html($data)
{
    // strip unnecessary comments and characters from a web page's text
    // all line comments, multi-line comments \r \n \t multi-spaces that make a script readable.
    // it also safeguards enquoted values and values within textareas, as these are required

    $data=preg_replace_callback("/>[^<]*<\/textarea/i", "harden_characters", $data);
    $data=preg_replace_callback("/\"[^\"<>]+\"/", "harden_characters", $data);

    $data=preg_replace("/(\/\/.*\n)/","",$data); // remove single line comments, like this, from // to \n
    $data=preg_replace("/(\t|\r|\n)/","",$data);  // remove new lines \n, tabs and \r
    $data=preg_replace("/(\/\*.*?\*\/)/","",$data);  // remove multi-line comments /* */
    $data=preg_replace("/(<![^>]*>)/","",$data);  // remove multi-line comments <!-- -->
    $data=preg_replace('/(\s+)/', ' ',$data); // replace multi spaces with singles
    $data=preg_replace('/>\s</', '><',$data);

    $data=preg_replace_callback("/\"[^\"<>]+\"/", "unharden_characters", $data);
    $data=preg_replace_callback("/>[^<]*<\/textarea/", "unharden_characters", $data);

    return $data;
}

// encode whitespace inside protected values so the stripping above leaves it alone
function harden_characters($array)
{
    $safe=$array[0];
    $safe=preg_replace('/\n/', "%0A", $safe);
    $safe=preg_replace('/\t/', "%09", $safe);
    $safe=preg_replace('/\s/', "&nbsp;", $safe);
    return $safe;
}

// restore the whitespace encoded by harden_characters()
function unharden_characters($array)
{
    $safe=$array[0];
    $safe=preg_replace('/%0A/', "\n", $safe);
    $safe=preg_replace('/%09/', "\t", $safe);
    $safe=preg_replace('/&nbsp;/', " ", $safe);
    return $safe;
}
?>

The article code was similar to this, which shouldn't work, as php_strip_whitespace takes a filename as input:

<?php
// ob_start(); and output here
$data=ob_get_contents();
ob_end_clean();
if(strstr($_SERVER['HTTP_ACCEPT_ENCODING'],'gzip'))
{
    $data=gzencode(php_strip_whitespace($data),9);
    header('Content-Encoding: gzip');
}
echo $data;
?>
Cosmo
01-Apr-2006 01:27
With newlines stripped your HEREDOCs won't work.
dnc at seznam dot cz
22-Oct-2005 12:01
This function can not be used to strip comments outside <?php ... ?>

// this comment will not be removed
<?php
// this comment will be removed
?>
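
A minimal sketch that checks this behaviour by writing the snippet above to a temporary file and stripping it. The temporary file name is arbitrary and exists only for the demo.

```php
<?php
// Write the example to a throwaway file: one "comment" outside the PHP
// tags (which is really inline HTML/text) and one real comment inside them.
$tmp = tempnam(sys_get_temp_dir(), 'psw');
file_put_contents(
    $tmp,
    "// this comment will not be removed\n<?php\n// this comment will be removed\n?>\n"
);

$out = php_strip_whitespace($tmp);
unlink($tmp);

// The text outside the tags is passed through untouched.
var_dump(strpos($out, 'will not be removed') !== false); // bool(true)
// The comment inside the tags is gone.
var_dump(strpos($out, 'will be removed') !== false);     // bool(false)
```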
amedee at amedee dot be
09-Jul-2005 02:29
Not only can this be used for JavaScript files, but also for:

* Java source code
* CSS (Style Sheets)
* Any file with C-style comments.
