class.t3lib_parsehtml_proc.php File Reference


Namespaces

namespace TYPO3

Classes

class t3lib_parsehtml_proc

Functions

TS_preserve_db ($value)

Preserve special tags.

TS_preserve_rte ($value)

Preserve special tags.

TS_transform_db ($value, $css=FALSE)

Transformation handler: 'ts_transform' + 'css_transform' / direction: "db". Cleaning (->db) for standard content elements (ts).

TS_transform_rte ($value, $css=0)

Transformation handler: 'ts_transform' + 'css_transform' / direction: "rte". Set (->rte) for standard content elements (ts).

TS_strip_db ($value)

Transformation handler: 'ts_strip' / direction: "db". Removing all non-allowed tags.

getURL ($url)

Reads the file or url $url and returns the content.

HTMLcleaner_db ($content, $tagList='')

Function for cleaning content going into the database.

getKeepTags ($direction='rte', $tagList='')

Creates an array of configuration for the HTMLcleaner function based on whether content goes TO or FROM the Rich Text Editor ($direction). Unless "tagList" is given, the function caches the configuration for the next time processing takes place.

divideIntoLines ($value, $count=5, $returnArray=FALSE)

This resolves the $value into parts based on <div></div>-sections, <p>-sections and <br />-tags.

setDivTags ($value, $dT='p')

Converts all lines into <div></div>/<p></p>-sections.

internalizeFontTags ($value)

This splits the $value in font-tag chunks.

siteUrl ()

Returns SiteURL based on thisScript.

rteImageStorageDir ()

Return the storage folder of RTE image files.

removeTables ($value, $breakChar='<br />')

Remove all tables from incoming code. The function tries to do this in a more or less respectful way.

defaultTStagMapping ($code, $direction='rte')

Default tag mapping for TS.

getWHFromAttribs ($attribArray)

Finds width and height from the attribute array. If width and height are found in the style attribute, those values are used.

urlInfoForLinkTags ($url)

Parse <a>-tag href and return status of email, external, file or page.

TS_AtagToAbs ($value, $dontSetRTEKEEP=FALSE)

Converting <a>-tags to absolute URLs (+ setting rtekeep attribute).

Function Documentation

defaultTStagMapping ($code, $direction='rte')

Default tag mapping for TS.

Parameters:

string Input code to process

string Direction: to database (db) or from database to RTE (rte)

Returns:
string Processed value

Definition at line 1376 of file class.t3lib_parsehtml_proc.php.
Referenced by TS_transform_db().
    {
        if ($direction=='db') {
            $code=$this->mapTags($code,array( // Map tags
                'strong' => 'b',
                'em' => 'i'
            ));
        }
        if ($direction=='rte') {
            $code=$this->mapTags($code,array( // Map tags
                'b' => 'strong',
                'i' => 'em'
            ));
        }
        return $code;
    }
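The two mapping directions can be tried out in isolation. The following is a minimal sketch, not the TYPO3 implementation: mapTagsSketch() and defaultTagMappingSketch() are hypothetical names, and the regular expression is an assumed stand-in for the inherited mapTags() helper, whose source is not shown on this page.

```php
<?php
// Hypothetical stand-in for the inherited mapTags() helper: renames tags by name.
function mapTagsSketch(string $html, array $map): string
{
    foreach ($map as $from => $to) {
        // Match opening and closing tags of exactly this name, keeping attributes.
        $pattern = '#<(/?)' . preg_quote($from, '#') . '(\s[^>]*)?>#i';
        $html = preg_replace($pattern, '<${1}' . $to . '${2}>', $html);
    }
    return $html;
}

// Mirrors the two directions handled by defaultTStagMapping().
function defaultTagMappingSketch(string $code, string $direction = 'rte'): string
{
    if ($direction === 'db') {
        return mapTagsSketch($code, ['strong' => 'b', 'em' => 'i']);
    }
    if ($direction === 'rte') {
        return mapTagsSketch($code, ['b' => 'strong', 'i' => 'em']);
    }
    return $code;
}

echo defaultTagMappingSketch('<b>bold</b> and <i>italic</i>', 'rte'), "\n";
// prints: <strong>bold</strong> and <em>italic</em>
```

Tags such as <br> or <img> are left alone because the pattern requires whitespace or ">" directly after the tag name.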

divideIntoLines ($value, $count=5, $returnArray=FALSE)

This resolves the $value into parts based on <div></div>-sections, <p>-sections and <br />-tags. These are returned as lines separated by chr(10). The point is to resolve the HTML code returned from the RTE into ordinary lines so it is 'human-readable'. The function ->setDivTags does the opposite. This function processes content to go into the database.

Parameters:

string Value to process.

integer Recursion brake. Decremented on each recursion down to zero. Default is 5 (which equals the allowed nesting levels of p/div tags).

boolean If true, an array with the lines is returned, otherwise a string of the processed input value.

Returns:
string Processed input value.

See also:
setDivTags()

Definition at line 1137 of file class.t3lib_parsehtml_proc.php.
References HTMLcleaner_db(), and internalizeFontTags().
Referenced by TS_transform_db().
    {

        // Internalize font tags (move them from OUTSIDE p/div to inside it that is the case):
        if ($this->procOptions['internalizeFontTags']) {$value = $this->internalizeFontTags($value);}

        // Setting configuration for processing:
        $allowTagsOutside = t3lib_div::trimExplode(',',strtolower($this->procOptions['allowTagsOutside']?$this->procOptions['allowTagsOutside']:'img'),1);
        $remapParagraphTag = strtoupper($this->procOptions['remapParagraphTag']);
        $divSplit = $this->splitIntoBlock('div,p',$value,1); // Setting the third param to 1 will eliminate false end-tags. Maybe this is a good thing to do...?

        if ($this->procOptions['keepPDIVattribs']) {
            $keepAttribListArr = t3lib_div::trimExplode(',',strtolower($this->procOptions['keepPDIVattribs']),1);
        } else {
            $keepAttribListArr = array();
        }

        // Returns plainly the value if there was no div/p sections in it
        if (count($divSplit)<=1 || $count<=0) {
            return $value;
        }

        // Traverse the splitted sections:
        foreach($divSplit as $k => $v) {
            if ($k%2) { // Inside
                $v=$this->removeFirstAndLastTag($v);

                // Fetching 'sub-lines' - which will explode any further p/div nesting...
                $subLines = $this->divideIntoLines($v,$count-1,1);
                if (is_array($subLines)) { // So, if there happend to be sub-nesting of p/div, this is written directly as the new content of THIS section. (This would be considered 'an error')
                    // No noting.
                } else { //... but if NO subsection was found, we process it as a TRUE line without erronous content:
                    $subLines = array($subLines);
                    if (!$this->procOptions['dontConvBRtoParagraph']) { // process break-tags, if configured for. Simply, the breaktags will here be treated like if each was a line of content...
                        $subLines = spliti('<br[[:space:]]*[\/]?>',$v);
                    }

                    // Traverse sublines (there is typically one, except if <br/> has been converted to lines as well!)
                    reset($subLines);
                    while(list($sk)=each($subLines)) {

                        // Clear up the subline for DB.
                        $subLines[$sk]=$this->HTMLcleaner_db($subLines[$sk]);

                        // Get first tag, attributes etc:
                        $fTag = $this->getFirstTag($divSplit[$k]);
                        $tagName=strtolower($this->getFirstTagName($divSplit[$k]));
                        $attribs=$this->get_tag_attributes($fTag);

                        // Keep attributes (lowercase)
                        $newAttribs=array();
                        if (count($keepAttribListArr)) {
                            foreach($keepAttribListArr as $keepA) {
                                if (isset($attribs[0][$keepA])) { $newAttribs[$keepA] = $attribs[0][$keepA]; }
                            }
                        }

                        // ALIGN attribute:
                        if (!$this->procOptions['skipAlign'] && strcmp(trim($attribs[0]['align']),'') && strtolower($attribs[0]['align'])!='left') { // Set to value, but not 'left'
                            $newAttribs['align']=strtolower($attribs[0]['align']);
                        }

                        // CLASS attribute:
                        if (!$this->procOptions['skipClass'] && strcmp(trim($attribs[0]['class']),'')) { // Set to whatever value
                            if (!count($this->allowedClasses) || in_array(strtoupper($attribs[0]['class']),$this->allowedClasses)) {
                                $newAttribs['class']=$attribs[0]['class'];
                            }
                        }

                        // Remove any line break char (10 or 13)
                        $subLines[$sk]=ereg_replace(chr(10).'|'.chr(13),'',$subLines[$sk]);

                        // If there are any attributes or if we are supposed to remap the tag, then do so:
                        if (count($newAttribs) && strcmp($remapParagraphTag,'1')) {
                            if ($remapParagraphTag=='P') $tagName='p';
                            if ($remapParagraphTag=='DIV') $tagName='div';
                            $subLines[$sk]='<'.trim($tagName.' '.$this->compileTagAttribs($newAttribs)).'>'.$subLines[$sk].'</'.$tagName.'>';
                        }
                    }
                }
                // Add the processed line(s)
                $divSplit[$k] = implode(chr(10),$subLines);

                // If it turns out the line is just blank (containing a &nbsp; possibly) then just make it pure blank:
                if (trim(strip_tags($divSplit[$k]))=='&nbsp;') $divSplit[$k]='';
            } else { // outside div:
                // Remove positions which are outside div/p tags and without content
                $divSplit[$k]=trim(strip_tags($divSplit[$k],'<'.implode('><',$allowTagsOutside).'>'));
                if (!strcmp($divSplit[$k],'')) unset($divSplit[$k]); // Remove part if it's empty
            }
        }

        // Return value:
        return $returnArray ? $divSplit : implode(chr(10),$divSplit);
    }
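The core idea, turning block sections and break tags into plain lines, can be illustrated without the class. This is a simplified sketch under stated assumptions: divideIntoLinesSketch() is a hypothetical name, sections are matched with a regular expression instead of splitIntoBlock(), and nesting, attribute handling and HTMLcleaner_db() are left out entirely.

```php
<?php
// Simplified sketch: no nesting, attributes, or tag cleaning are handled here.
function divideIntoLinesSketch(string $value): string
{
    // Capture the inner content of each <p> or <div> section.
    if (!preg_match_all('#<(p|div)\b[^>]*>(.*?)</\1>#is', $value, $matches)) {
        return $value; // No sections found: return the value unchanged.
    }
    $lines = [];
    foreach ($matches[2] as $inner) {
        // Treat each <br> inside a section as a line boundary.
        foreach (preg_split('#<br\s*/?>#i', $inner) as $line) {
            // Remove raw line break characters (10 or 13) from the line itself.
            $lines[] = str_replace(["\r", "\n"], '', $line);
        }
    }
    return implode("\n", $lines);
}

echo divideIntoLinesSketch('<p>One</p><div>Two<br />Three</div>'), "\n";
// prints three lines: One, Two, Three
```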

getKeepTags ($direction='rte', $tagList='')

Creates an array of configuration for the HTMLcleaner function based on whether content goes TO or FROM the Rich Text Editor ($direction). Unless "tagList" is given, the function caches the configuration for the next time processing takes place.
(In this class that is the case only if we are processing a bulletlist)

Parameters:

string The direction of the content being processed by the output configuration; "db" (content going into the database FROM the rte) or "rte" (content going into the form)

string Comma list of tags to keep (overriding default which is to keep all + take notice of internal configuration)

Returns:
array Configuration array

See also:
HTMLcleaner_db()

Definition at line 1028 of file class.t3lib_parsehtml_proc.php.
Referenced by HTMLcleaner_db(), and setDivTags().
    {
        if (!is_array($this->getKeepTags_cache[$direction]) || $tagList) {

            // Setting up allowed tags:
            if (strcmp($tagList,'')) { // If the $tagList input var is set, this will take precedence
                $keepTags = array_flip(t3lib_div::trimExplode(',',$tagList,1));
            } else { // Default is to get allowed/denied tags from internal array of processing options:
                // Construct default list of tags to keep:
                $typoScript_list = 'b,i,u,a,img,br,div,center,pre,font,hr,sub,sup,p,strong,em,li,ul,ol,blockquote,strike,span';
                $keepTags = array_flip(t3lib_div::trimExplode(',',$typoScript_list.','.strtolower($this->procOptions['allowTags']),1));

                // For tags to deny, remove them from $keepTags array:
                $denyTags = t3lib_div::trimExplode(',',$this->procOptions['denyTags'],1);
                foreach($denyTags as $dKe) {
                    unset($keepTags[$dKe]);
                }
            }

            // Based on the direction of content, set further options:
            switch ($direction) {

                // GOING from database to Rich Text Editor:
                case 'rte':
                    // Transform bold/italics tags to strong/em
                    if (isset($keepTags['b'])) {$keepTags['b']=array('remap'=>'STRONG');}
                    if (isset($keepTags['i'])) {$keepTags['i']=array('remap'=>'EM');}

                    // Transforming keepTags array so it can be understood by the HTMLcleaner function. This basically converts the format of the array from TypoScript (having .'s) to plain multi-dimensional array.
                    list($keepTags) = $this->HTMLparserConfig($this->procOptions['HTMLparser_rte.'],$keepTags);
                break;

                // GOING from RTE to database:
                case 'db':
                    // Transform strong/em back to bold/italics:
                    if (isset($keepTags['strong'])) { $keepTags['strong']=array('remap'=>'b'); }
                    if (isset($keepTags['em'])) { $keepTags['em']=array('remap'=>'i'); }

                    // Setting up span tags if they are allowed:
                    if (isset($keepTags['span'])) {
                        $classes=array_merge(array(''),$this->allowedClasses);
                        $keepTags['span']=array(
                            'allowedAttribs'=>'class',
                            'fixAttrib' => Array(
                                'class' => Array (
                                    'list' => $classes,
                                    'removeIfFalse' => 1
                                )
                            ),
                            'rmTagIfNoAttrib' => 1
                        );
                        if (!$this->procOptions['allowedClasses']) unset($keepTags['span']['fixAttrib']['class']['list']);
                    }

                    // Setting up font tags if they are allowed:
                    if (isset($keepTags['font'])) {
                        $colors=array_merge(array(''),t3lib_div::trimExplode(',',$this->procOptions['allowedFontColors'],1));
                        $keepTags['font']=array(
                            'allowedAttribs'=>'face,color,size',
                            'fixAttrib' => Array(
                                'face' => Array (
                                    'removeIfFalse' => 1
                                ),
                                'color' => Array (
                                    'removeIfFalse' => 1,
                                    'list'=>$colors
                                ),
                                'size' => Array (
                                    'removeIfFalse' => 1,
                                )
                            ),
                            'rmTagIfNoAttrib' => 1
                        );
                        if (!$this->procOptions['allowedFontColors']) unset($keepTags['font']['fixAttrib']['color']['list']);
                    }

                    // Setting further options, getting them from the processiong options:
                    $TSc = $this->procOptions['HTMLparser_db.'];
                    if (!$TSc['globalNesting']) $TSc['globalNesting']='b,i,u,a,center,font,sub,sup,strong,em,strike,span';
                    if (!$TSc['noAttrib']) $TSc['noAttrib']='b,i,u,br,center,hr,sub,sup,strong,em,li,ul,ol,blockquote,strike';

                    // Transforming the array from TypoScript to regular array:
                    list($keepTags) = $this->HTMLparserConfig($TSc,$keepTags);
                break;
            }

            // Caching (internally, in object memory) the result unless tagList is set:
            if (!$tagList) {
                $this->getKeepTags_cache[$direction] = $keepTags;
            } else {
                return $keepTags;
            }
        }

        // Return result:
        return $this->getKeepTags_cache[$direction];
    }
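The allow/deny handling at the top of getKeepTags() can be looked at on its own. The following sketch covers only that step; keepTagListSketch() is a hypothetical name, and the inline closure is an assumed approximation of t3lib_div::trimExplode() with its third parameter set (split on commas, trim, drop empty entries).

```php
<?php
// Sketch of the allow/deny list handling only; the function name is hypothetical.
function keepTagListSketch(string $defaults, string $allowTags, string $denyTags): array
{
    $split = static function (string $list): array {
        // Split on commas, trim each entry, and drop empty entries.
        return array_values(array_filter(array_map('trim', explode(',', $list)), 'strlen'));
    };
    // Tag names become array keys so that removal is a simple unset().
    $keepTags = array_flip($split($defaults . ',' . strtolower($allowTags)));
    foreach ($split($denyTags) as $tag) {
        unset($keepTags[$tag]);
    }
    return array_keys($keepTags);
}

print_r(keepTagListSketch('b,i,u,a', 'TABLE, tr', 'u'));
// lists: b, i, a, table, tr
```

Note that, as in the listing, only the allowTags value is lowercased; a deny entry written in uppercase would not match a lowercase key.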

getURL ($url)

Reads the file or url $url and returns the content.

Parameters:

string Filepath/URL to read

Returns:
string The content from the resource given as input.

See also:
t3lib_div::getURL()

Definition at line 993 of file class.t3lib_parsehtml_proc.php.
Referenced by t3lib_htmlmail::fetchHTML(), and t3lib_htmlmail::getExtendedURL().
    {
        return t3lib_div::getURL($url);
    }

getWHFromAttribs ($attribArray)

Finds width and height from the attribute array. If width and height are found in the style attribute, those values are used.

Parameters:

array Array of attributes from tag in which to search. More specifically the content of the key "style" is used to extract "width:xxx / height:xxx" information

Returns:
array Integer w/h in key 0/1. Zero is returned if not found.

Definition at line 1399 of file class.t3lib_parsehtml_proc.php.
Referenced by t3lib_parsehtml_proc::TS_images_db().
    {
        $style =trim($attribArray['style']);
        if ($style) {
            $regex='[[:space:]]*:[[:space:]]*([0-9]*)[[:space:]]*px';
            // Width
            eregi('width'.$regex,$style,$reg);
            $w = intval($reg[1]);
            // Height
            eregi('height'.$regex,$style,$reg);
            $h = intval($reg[1]);
        }
        if (!$w) {
            $w = $attribArray['width'];
        }
        if (!$h) {
            $h = $attribArray['height'];
        }
        return array(intval($w),intval($h));
    }
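The listing relies on eregi(), which belongs to the POSIX regex extension removed in PHP 7. A minimal sketch of the same lookup with preg_match() might look like this; getWHFromAttribsSketch() is a hypothetical name and the code is an illustration, not the class method.

```php
<?php
// Sketch using preg_match() in place of the POSIX eregi() calls in the listing.
function getWHFromAttribsSketch(array $attribs): array
{
    $w = 0;
    $h = 0;
    $style = trim($attribs['style'] ?? '');
    if ($style !== '') {
        // Pixel values in the style attribute take precedence.
        if (preg_match('/width\s*:\s*([0-9]+)\s*px/i', $style, $m)) {
            $w = (int)$m[1];
        }
        if (preg_match('/height\s*:\s*([0-9]+)\s*px/i', $style, $m)) {
            $h = (int)$m[1];
        }
    }
    // Fall back to the plain width/height attributes when the style gave nothing.
    if (!$w) {
        $w = (int)($attribs['width'] ?? 0);
    }
    if (!$h) {
        $h = (int)($attribs['height'] ?? 0);
    }
    return [$w, $h];
}

print_r(getWHFromAttribsSketch(['style' => 'width: 120px; height:80px', 'width' => '300']));
// lists: 120, 80
```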

HTMLcleaner_db ($content, $tagList='')

Function for cleaning content going into the database.
Content is cleaned e.g. by removing disallowed HTML and undoing htmlspecialchars() conversion (de-HSC) on the content. It basically calls HTMLcleaner from the parent class with a preset configuration specifically set up for cleaning content going from the RTE into the db.

Parameters:

string Content to clean up

string Comma list of tags to specifically allow. Default comes from getKeepTags and is ""

Returns:
string Clean content

See also:
getKeepTags()

Definition at line 1007 of file class.t3lib_parsehtml_proc.php.
References getKeepTags().
Referenced by divideIntoLines(), and TS_transform_db().
    {
        if (!$tagList) {
            $keepTags = $this->getKeepTags('db');
        } else {
            $keepTags = $this->getKeepTags('db',$tagList);
        }
        $kUknown = $this->procOptions['dontRemoveUnknownTags_db'] ? 1 : 0; // Default: remove unknown tags.
        $hSC = $this->procOptions['dontUndoHSC_db'] ? 0 : -1; // Default: re-convert literals to characters (that is &lt; to <)

        return $this->HTMLcleaner($content,$keepTags,$kUknown,$hSC);
    }

internalizeFontTags ($value)

This splits the $value in font-tag chunks.
If there are any <p>/<div> sections inside of them, the font-tag is wrapped AROUND the content INSIDE of the P/DIV sections and the outer font-tag is removed. This function seems to be a good choice for pre-processing content that has been pasted into the RTE from e.g. StarOffice. In that case the font-tags are normally on the OUTSIDE of the sections. This function is used by e.g. divideIntoLines() if the processing option 'internalizeFontTags' is set.

Parameters:

string Input content

Returns:
string Output content

See also:
divideIntoLines()

Definition at line 1286 of file class.t3lib_parsehtml_proc.php.
Referenced by divideIntoLines().
    {

        // Splitting into font tag blocks:
        $fontSplit = $this->splitIntoBlock('font',$value);

        foreach($fontSplit as $k => $v) {
            if ($k%2) { // Inside
                $fTag = $this->getFirstTag($v); // Fint font-tag

                $divSplit_sub = $this->splitIntoBlock('div,p',$this->removeFirstAndLastTag($v),1);
                if (count($divSplit_sub)>1) { // If there were div/p sections inside the font-tag, do something about it...
                    // traverse those sections:
                    foreach($divSplit_sub as $k2 => $v2) {
                        if ($k2%2) { // Inside
                            $div_p = $this->getFirstTag($v2); // Fint font-tag
                            $div_p_tagname = $this->getFirstTagName($v2); // Fint font-tag
                            $v2=$this->removeFirstAndLastTag($v2); // ... and remove it from original.
                            $divSplit_sub[$k2]=$div_p.$fTag.$v2.'</font>'.'</'.$div_p_tagname.'>';
                        } elseif (trim(strip_tags($v2))) {
                            $divSplit_sub[$k2]=$fTag.$v2.'</font>';
                        }
                    }
                    $fontSplit[$k]=implode('',$divSplit_sub);
                }
            }
        }

        return implode('',$fontSplit);
    }
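The effect of moving a font tag inside its paragraphs is easiest to see on a small input. The sketch below is an assumption-laden simplification: internalizeFontTagsSketch() is a hypothetical name, it uses regular expressions instead of splitIntoBlock(), it does not handle nested font tags, and unlike the listing it does not wrap loose text that sits between the inner sections.

```php
<?php
// Simplified sketch: one level of <font> around <p>/<div> sections only.
function internalizeFontTagsSketch(string $value): string
{
    return preg_replace_callback(
        '#(<font\b[^>]*>)(.*?)</font>#is',
        static function (array $m): string {
            [$whole, $fontTag, $inner] = $m;
            if (!preg_match('#<(p|div)\b#i', $inner)) {
                return $whole; // No block sections inside: leave untouched.
            }
            // Re-wrap the content of each inner section with the font tag.
            return preg_replace_callback(
                '#(<(p|div)\b[^>]*>)(.*?)(</\2>)#is',
                static function (array $s) use ($fontTag): string {
                    return $s[1] . $fontTag . $s[3] . '</font>' . $s[4];
                },
                $inner
            );
        },
        $value
    );
}

echo internalizeFontTagsSketch('<font color="red"><p>A</p><p>B</p></font>'), "\n";
// prints: <p><font color="red">A</font></p><p><font color="red">B</font></p>
```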

removeTables ($value, $breakChar='<br />')

Remove all tables from incoming code. The function tries to do this in a more or less respectful way.
The approach is to resolve each table cell's content and implode it all by <br /> chars. Thus at least the content is preserved in some way.

Parameters:

string Input value

string Break character to use for linebreaks.

Returns:
string Output value

Definition at line 1344 of file class.t3lib_parsehtml_proc.php.
References table().
    {

        // Splitting value into table blocks:
        $tableSplit = $this->splitIntoBlock('table',$value);

        // Traverse blocks of tables:
        foreach($tableSplit as $k => $v) {
            if ($k%2) {
                $tableSplit[$k]='';
                $rowSplit = $this->splitIntoBlock('tr',$v);
                foreach($rowSplit as $k2 => $v2) {
                    if ($k2%2) {
                        $cellSplit = $this->getAllParts($this->splitIntoBlock('td',$v2),1,0);
                        foreach($cellSplit as $k3 => $v3) {
                            $tableSplit[$k].=$v3.$breakChar;
                        }
                    }
                }
            }
        }

        // Implode it all again:
        return implode($breakChar,$tableSplit);
    }
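The flattening of cells into break-separated content can be sketched as follows. This is only an illustration under assumptions: removeTablesSketch() is a hypothetical name, tables are assumed to be well-formed and not nested, and unlike the listing the text around a table is not additionally joined with the break character.

```php
<?php
// Simplified sketch: assumes well-formed, non-nested tables.
function removeTablesSketch(string $value, string $breakChar = '<br />'): string
{
    return preg_replace_callback(
        '#<table\b[^>]*>(.*?)</table>#is',
        static function (array $table) use ($breakChar): string {
            // Collect the inner content of every cell, in document order.
            preg_match_all('#<td\b[^>]*>(.*?)</td>#is', $table[1], $cells);
            $out = '';
            foreach ($cells[1] as $cell) {
                $out .= $cell . $breakChar;
            }
            return $out;
        },
        $value
    );
}

echo removeTablesSketch('<table><tr><td>A</td><td>B</td></tr></table>'), "\n";
// prints: A<br />B<br />
```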

rteImageStorageDir ()

Return the storage folder of RTE image files.
Default is $GLOBALS['TYPO3_CONF_VARS']['BE']['RTE_imageStorageDir'] unless something else is configured in the types configuration for the RTE.

Returns:
string

Definition at line 1332 of file class.t3lib_parsehtml_proc.php.
Referenced by t3lib_parsehtml_proc::TS_images_db().
    {
        return $this->rte_p['imgpath'] ? $this->rte_p['imgpath'] : $GLOBALS['TYPO3_CONF_VARS']['BE']['RTE_imageStorageDir'];
    }

setDivTags ($value, $dT='p')

Converts all lines into <div></div>/<p></p>-sections (unless the line is a div-section already). For processing of content going FROM database TO RTE.

Parameters:

string Value to convert

string Tag to wrap with. Should be either "p" or "div", preferably lowercase.

Returns:
string Processed value.

See also:
divideIntoLines()

Definition at line 1241 of file class.t3lib_parsehtml_proc.php.
References getKeepTags().
Referenced by TS_transform_rte().
    {

        // First, setting configuration for the HTMLcleaner function. This will process each line between the <div>/<p> section on their way to the RTE
        $keepTags = $this->getKeepTags('rte');
        $kUknown = $this->procOptions['dontProtectUnknownTags_rte'] ? 0 : 'protect'; // Default: remove unknown tags.
        $hSC = $this->procOptions['dontHSC_rte'] ? 0 : 1; // Default: re-convert literals to characters (that is &lt; to <)
        $convNBSP = !$this->procOptions['dontConvAmpInNBSP_rte']?1:0;

        // Divide the content into lines, based on chr(10):
        $parts = explode(chr(10),$value);
        foreach($parts as $k => $v) {

            // Processing of line content:
            if (!strcmp(trim($parts[$k]),'')) { // If the line is blank, set it to &nbsp;
                $parts[$k]='&nbsp;';
            } else { // Clean the line content:
                $parts[$k]=$this->HTMLcleaner($parts[$k],$keepTags,$kUknown,$hSC);
                if ($convNBSP) $parts[$k]=str_replace('&amp;nbsp;','&nbsp;',$parts[$k]);
            }

            // Wrapping the line in <$dT> is not already wrapped:
            $testStr = strtolower(trim($parts[$k]));
            if (substr($testStr,0,4)!='<div' || substr($testStr,-6)!='</div>') {
                if (substr($testStr,0,2)!='<p' || substr($testStr,-4)!='</p>') {
                    // Only set p-tags if there is not already div or p tags:
                    $parts[$k]='<'.$dT.'>'.$parts[$k].'</'.$dT.'>';
                }
            }
        }

        // Implode result:
        return implode(chr(10),$parts);
    }
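The wrapping logic is the mirror image of divideIntoLines() and can be tried without the class. The following is a minimal sketch: setDivTagsSketch() is a hypothetical name, and the HTMLcleaner() pass applied to each line in the listing is deliberately omitted.

```php
<?php
// Sketch of the line wrapping only; no tag cleaning is performed.
function setDivTagsSketch(string $value, string $dT = 'p'): string
{
    $parts = explode("\n", $value);
    foreach ($parts as $k => $line) {
        if (trim($line) === '') {
            $line = '&nbsp;'; // Blank lines become a non-breaking space.
        }
        $test = strtolower(trim($line));
        $isDiv = substr($test, 0, 4) === '<div' && substr($test, -6) === '</div>';
        $isP = substr($test, 0, 2) === '<p' && substr($test, -4) === '</p>';
        // Only wrap lines that are not already a complete div or p section.
        if (!$isDiv && !$isP) {
            $line = '<' . $dT . '>' . $line . '</' . $dT . '>';
        }
        $parts[$k] = $line;
    }
    return implode("\n", $parts);
}

echo setDivTagsSketch("First line\n\n<p>Already wrapped</p>"), "\n";
```

With this input the first line is wrapped in <p>, the empty line becomes <p>&nbsp;</p>, and the third line is passed through unchanged.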

siteUrl ()

Returns SiteURL based on thisScript.

Returns:
string Value of t3lib_div::getIndpEnv('TYPO3_SITE_URL');

See also:
t3lib_div::getIndpEnv()

Definition at line 1322 of file class.t3lib_parsehtml_proc.php.
Referenced by localFolderTree::SC_browse_links::expandPage(), TS_AtagToAbs(), t3lib_parsehtml_proc::TS_images_db(), t3lib_parsehtml_proc::TS_images_rte(), t3lib_parsehtml_proc::TS_links_db(), t3lib_parsehtml_proc::TS_links_rte(), t3lib_parsehtml_proc::TS_reglinks(), and urlInfoForLinkTags().
    {
        return t3lib_div::getIndpEnv('TYPO3_SITE_URL');
    }

TS_AtagToAbs ($value, $dontSetRTEKEEP=FALSE)

Converting <a>-tags to absolute URLs (+ setting rtekeep attribute).

Parameters:

string Content input

boolean If true, then the "rtekeep" attribute will not be set.

Returns:
string Content output

Definition at line 1484 of file class.t3lib_parsehtml_proc.php.
References siteUrl().
Referenced by t3lib_parsehtml_proc::TS_links_rte(), and t3lib_parsehtml_proc::TS_reglinks().
    {
        $blockSplit = $this->splitIntoBlock('A',$value);
        reset($blockSplit);
        while(list($k,$v)=each($blockSplit)) {
            if ($k%2) { // block:
                $attribArray = $this->get_tag_attributes_classic($this->getFirstTag($v),1);

                // Checking if there is a scheme, and if not, prepend the current url.
                if (strlen($attribArray['href'])) { // ONLY do this if href has content - the <a> tag COULD be an anchor and if so, it should be preserved...
                    $uP = parse_url(strtolower($attribArray['href']));
                    if (!$uP['scheme']) {
                        $attribArray['href'] = $this->siteUrl().substr($attribArray['href'],strlen($this->relBackPath));
                    }
                } else {
                    $attribArray['rtekeep'] = 1;
                }
                if (!$dontSetRTEKEEP) $attribArray['rtekeep'] = 1;

                $bTag='<a '.t3lib_div::implodeAttributes($attribArray,1).'>';
                $eTag='</a>';
                $blockSplit[$k] = $bTag.$this->TS_AtagToAbs($this->removeFirstAndLastTag($blockSplit[$k])).$eTag;
            }
        }
        return implode('',$blockSplit);
    }
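The href rewriting step can be isolated from the tag splitting. The sketch below is illustrative only: hrefToAbsSketch() is a hypothetical name, and the site URL and back path are passed in as plain parameters (with example.org as a made-up host) instead of being read from siteUrl() and $this->relBackPath.

```php
<?php
// Sketch of the href handling for a single link; parameters are assumptions.
function hrefToAbsSketch(string $href, string $siteUrl, string $relBackPath = ''): string
{
    if ($href === '') {
        return $href; // Probably a named anchor: leave it alone.
    }
    $parts = parse_url(strtolower($href));
    if (is_array($parts) && !empty($parts['scheme'])) {
        return $href; // A scheme is present, so the URL is already absolute.
    }
    // Drop the back path prefix by length, then prepend the site URL.
    return $siteUrl . substr($href, strlen($relBackPath));
}

echo hrefToAbsSketch('../fileadmin/doc.pdf', 'http://example.org/', '../'), "\n";
// prints: http://example.org/fileadmin/doc.pdf
```

As in the listing, the back path is removed purely by length, so the result is only meaningful when the href really starts with that prefix.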

TS_preserve_db ($value)

Preserve special tags.

Parameters:

string Content input

Returns:
string Content output

Definition at line 735 of file class.t3lib_parsehtml_proc.php.
Referenced by t3lib_parsehtml_proc::RTE_transform().
    {
        if (!$this->preserveTags) return $value;

        // Splitting into blocks for processing (span-tags are used for special tags)
        $blockSplit = $this->splitIntoBlock('span',$value);
        foreach($blockSplit as $k => $v) {
            if ($k%2) { // block:
                $attribArray=$this->get_tag_attributes_classic($this->getFirstTag($v));
                if ($attribArray['specialtag']) {
                    $theTag = rawurldecode($attribArray['specialtag']);
                    $theTagName = $this->getFirstTagName($theTag);
                    $blockSplit[$k] = $theTag.$this->removeFirstAndLastTag($blockSplit[$k]).'</'.$theTagName.'>';
                }
            }
        }
        return implode('',$blockSplit);
    }

TS_preserve_rte ($value)

Preserve special tags.

Parameters:

string Content input

Returns:
string Content output

Definition at line 759 of file class.t3lib_parsehtml_proc.php.
Referenced by t3lib_parsehtml_proc::RTE_transform().
    {
        if (!$this->preserveTags) return $value;

        $blockSplit = $this->splitIntoBlock($this->preserveTags,$value);
        foreach($blockSplit as $k => $v) {
            if ($k%2) { // block:
                $blockSplit[$k] = '<span specialtag="'.rawurlencode($this->getFirstTag($v)).'">'.$this->removeFirstAndLastTag($blockSplit[$k]).'</span>';
            }
        }
        return implode('',$blockSplit);
    }
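TS_preserve_rte() and TS_preserve_db() form a round trip: the original opening tag is stored URL-encoded in a "specialtag" attribute of a span and restored later. A minimal sketch of that round trip for a single, non-nested tag is shown below; both function names and the tag "mytag" are made up for illustration, and regular expressions stand in for splitIntoBlock().

```php
<?php
// Round-trip sketch for a single, non-nested tag; "mytag" is a made-up example tag.
function preserveForRteSketch(string $html, string $tagName): string
{
    $name = preg_quote($tagName, '#');
    $pattern = '#(<' . $name . '\b[^>]*>)(.*?)</' . $name . '>#is';
    return preg_replace_callback($pattern, static function (array $m): string {
        // Hide the original opening tag in a URL-encoded attribute.
        return '<span specialtag="' . rawurlencode($m[1]) . '">' . $m[2] . '</span>';
    }, $html);
}

function restoreForDbSketch(string $html): string
{
    $pattern = '#<span specialtag="([^"]*)">(.*?)</span>#is';
    return preg_replace_callback($pattern, static function (array $m): string {
        $tag = rawurldecode($m[1]);
        // Recover the tag name from the decoded opening tag to build the closing tag.
        preg_match('#^<\s*([a-z0-9]+)#i', $tag, $name);
        return $tag . $m[2] . '</' . $name[1] . '>';
    }, $html);
}

$original = '<mytag id="7">Hello</mytag>';
$forRte = preserveForRteSketch($original, 'mytag');
echo $forRte, "\n";                     // span with the encoded tag in "specialtag"
echo restoreForDbSketch($forRte), "\n"; // the original markup again
```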

TS_strip_db ($value)

Transformation handler: 'ts_strip' / direction: "db". Removing all non-allowed tags.

Parameters:

string Content input

Returns:
string Content output

Definition at line 962 of file class.t3lib_parsehtml_proc.php.
Referenced by t3lib_parsehtml_proc::RTE_transform().
    {
        $value = strip_tags($value,'<'.implode('><',explode(',','b,i,u,a,img,br,div,center,pre,font,hr,sub,sup,p,strong,em,li,ul,ol,blockquote')).'>');
        return $value;
    }
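The one-liner in the listing converts a comma list into the "<b><i>..." form that strip_tags() accepts as its allow list. A small standalone sketch of the same call follows; stripDbSketch() is a hypothetical wrapper name.

```php
<?php
// Sketch of the allow-list stripping performed by TS_strip_db().
function stripDbSketch(string $value): string
{
    $allowed = 'b,i,u,a,img,br,div,center,pre,font,hr,sub,sup,p,strong,em,li,ul,ol,blockquote';
    // Turn "b,i,u" into "<b><i><u>", the format strip_tags() expects.
    $allowedTags = '<' . implode('><', explode(',', $allowed)) . '>';
    return strip_tags($value, $allowedTags);
}

echo stripDbSketch('<p>Keep <b>this</b> but not <span class="x">that</span></p>'), "\n";
// prints: <p>Keep <b>this</b> but not that</p>
```

Only the tags themselves are removed; the text inside a stripped tag is kept.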

TS_transform_db ($value, $css=FALSE)

Transformation handler: 'ts_transform' + 'css_transform' / direction: "db". Cleaning (->db) for standard content elements (ts).

Parameters:

string Content input

boolean If true, the transformation was "css_transform", otherwise "ts_transform"

Returns:
string Content output

See also:
TS_transform_rte()

Definition at line 780 of file class.t3lib_parsehtml_proc.php.
References defaultTStagMapping(), divideIntoLines(), HTMLcleaner_db(), and table().
Referenced by t3lib_parsehtml_proc::RTE_transform().
    {

        // safety... so forever loops are avoided (they should not occur, but an error would potentially do this...)
        $this->TS_transform_db_safecounter--;
        if ($this->TS_transform_db_safecounter<0) return $value;

        // Split the content from RTE by the occurence of these blocks:
        $blockSplit = $this->splitIntoBlock('TABLE,BLOCKQUOTE,'.$this->headListTags,$value);

        $cc=0;
        $aC = count($blockSplit);

        // Traverse the blocks
        foreach($blockSplit as $k => $v) {
            $cc++;
            $lastBR = $cc==$aC ? '' : chr(10);

            if ($k%2) { // Inside block:

                // Init:
                $tag=$this->getFirstTag($v);
                $tagName=strtolower($this->getFirstTagName($v));

                // Process based on the tag:
                switch($tagName) {
                    case 'blockquote': // Keep blockquotes, but clean the inside recursively in the same manner as the main code
                        $blockSplit[$k]='<'.$tagName.'>'.$this->TS_transform_db($this->removeFirstAndLastTag($blockSplit[$k]),$css).'</'.$tagName.'>'.$lastBR;
                    break;
                    case 'ol':
                    case 'ul': // Transform lists into <typolist>-tags:
                        if (!$css) {
                            if (!isset($this->procOptions['typolist']) || $this->procOptions['typolist']) {
                                $parts = $this->getAllParts($this->splitIntoBlock('LI',$this->removeFirstAndLastTag($blockSplit[$k])),1,0);
                                while(list($k2)=each($parts)) {
                                    $parts[$k2]=ereg_replace(chr(10).'|'.chr(13),'',$parts[$k2]); // remove all linesbreaks!
                                    $parts[$k2]=$this->defaultTStagMapping($parts[$k2],'db');
                                    $parts[$k2]=$this->cleanFontTags($parts[$k2],0,0,0);
                                    $parts[$k2] = $this->HTMLcleaner_db($parts[$k2],strtolower($this->procOptions['allowTagsInTypolists']?$this->procOptions['allowTagsInTypolists']:'br,font,b,i,u,a,img,span,strong,em'));
                                }
                                if ($tagName=='ol') { $params=' type="1"'; } else { $params=''; }
                                $blockSplit[$k]='<typolist'.$params.'>'.chr(10).implode(chr(10),$parts).chr(10).'</typolist>'.$lastBR;
                            }
                        } else {
                            $blockSplit[$k].=$lastBR;
                        }
                    break;
                    case 'table': // Tables are NOT allowed in any form (unless preserveTables is set or CSS is the mode)
                        if (!$this->procOptions['preserveTables'] && !$css) {
                            $blockSplit[$k]=$this->TS_transform_db($this->removeTables($blockSplit[$k]));
                        } else {
                            $blockSplit[$k]=str_replace(chr(10),'',$blockSplit[$k]).$lastBR;
                        }
                    break;
                    case 'h1':
                    case 'h2':
                    case 'h3':
                    case 'h4':
                    case 'h5':
                    case 'h6':
                        if (!$css) {
                            $attribArray=$this->get_tag_attributes_classic($tag);
                            // Processing inner content here:
                            $innerContent = $this->HTMLcleaner_db($this->removeFirstAndLastTag($blockSplit[$k]));

                            if (!isset($this->procOptions['typohead']) || $this->procOptions['typohead']) {
                                $type = intval(substr($tagName,1));
                                $blockSplit[$k]='<typohead'.
                                    ($type!=6?' type="'.$type.'"':'').
                                    ($attribArray['align']?' align="'.$attribArray['align'].'"':'').
                                    ($attribArray['class']?' class="'.$attribArray['class'].'"':'').
                                    '>'.
                                    $innerContent.
                                    '</typohead>'.
                                    $lastBR;
                            } else {
                                $blockSplit[$k]='<'.$tagName.
                                    ($attribArray['align']?' align="'.htmlspecialchars($attribArray['align']).'"':'').
                                    ($attribArray['class']?' class="'.htmlspecialchars($attribArray['class']).'"':'').
                                    '>'.
                                    $innerContent.
                                    '</'.$tagName.'>'.
                                    $lastBR;
                            }
                        } else {
                            $blockSplit[$k].=$lastBR;
                        }
                    break;
                    default:
                        $blockSplit[$k].=$lastBR;
                    break;
                }
            } else { // NON-block:
                if (strcmp(trim($blockSplit[$k]),'')) {
                    $blockSplit[$k]=$this->divideIntoLines($blockSplit[$k]).$lastBR;
                } else unset($blockSplit[$k]);
            }
        }
        $this->TS_transform_db_safecounter++;

        return implode('',$blockSplit);
    }
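The header branch of the switch can be illustrated in isolation. This sketch is an assumption-based extract: headerToTypoheadSketch() is a hypothetical name, the attributes are passed in as a plain array rather than parsed from the tag, and the inner content is not cleaned.

```php
<?php
// Sketch of the h1-h6 to <typohead> conversion; type 6 is the implicit default.
function headerToTypoheadSketch(string $tagName, string $inner, array $attribs = []): string
{
    $type = (int)substr($tagName, 1); // "h2" -> 2
    return '<typohead'
        . ($type !== 6 ? ' type="' . $type . '"' : '')
        . (!empty($attribs['align']) ? ' align="' . $attribs['align'] . '"' : '')
        . (!empty($attribs['class']) ? ' class="' . $attribs['class'] . '"' : '')
        . '>' . $inner . '</typohead>';
}

echo headerToTypoheadSketch('h2', 'Title', ['align' => 'center']), "\n";
// prints: <typohead type="2" align="center">Title</typohead>
```

Leaving out the type for h6 matches the reverse direction in TS_transform_rte(), where a missing type is treated as 6.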

TS_transform_rte ($value, $css=0)

Transformation handler: 'ts_transform' + 'css_transform' / direction: "rte". Set (->rte) for standard content elements (ts).

Parameters:

string Content input

boolean If true, the transformation was "css_transform", otherwise "ts_transform"

Returns:
string Content output

See also:
TS_transform_db()

Definition at line 891 of file class.t3lib_parsehtml_proc.php.
References setDivTags().
Referenced by t3lib_parsehtml_proc::RTE_transform().
    {

        // Split the content from Database by the occurence of these blocks:
        $blockSplit = $this->splitIntoBlock('TABLE,BLOCKQUOTE,TYPOLIST,TYPOHEAD,'.$this->headListTags,$value);

        // Traverse the blocks
        foreach($blockSplit as $k => $v) {
            if ($k%2) { // Inside one of the blocks:

                // Init:
                $tag = $this->getFirstTag($v);
                $tagName = strtolower($this->getFirstTagName($v));
                $attribArray = $this->get_tag_attributes_classic($tag);

                // Based on tagname, we do transformations:
                switch($tagName) {
                    case 'blockquote': // Keep blockquotes:
                        $blockSplit[$k] = $tag.
                            $this->TS_transform_rte($this->removeFirstAndLastTag($blockSplit[$k]),$css).
                            '</'.$tagName.'>';
                    break;
                    case 'typolist': // Transform typolist blocks into OL/UL lists. Type 1 is expected to be numerical block
                        if (!isset($this->procOptions['typolist']) || $this->procOptions['typolist']) {
                            $tListContent = $this->removeFirstAndLastTag($blockSplit[$k]);
                            $tListContent = ereg_replace('^[ ]*'.chr(10),'',$tListContent);
                            $tListContent = ereg_replace(chr(10).'[ ]*$','',$tListContent);
                            $lines = explode(chr(10),$tListContent);
                            $typ = $attribArray['type']==1 ? 'ol' : 'ul';
                            $blockSplit[$k] = '<'.$typ.'>'.chr(10).
                                '<li>'.implode('</li>'.chr(10).'<li>',$lines).'</li>'.
                                '</'.$typ.'>';
                        }
                    break;
                    case 'typohead': // Transform typohead into Hx tags.
                        if (!isset($this->procOptions['typohead']) || $this->procOptions['typohead']) {
                            $tC = $this->removeFirstAndLastTag($blockSplit[$k]);
                            $typ = t3lib_div::intInRange($attribArray['type'],0,6);
                            if (!$typ) $typ=6;
                            $align = $attribArray['align']?' align="'.$attribArray['align'].'"': '';
                            $class = $attribArray['class']?' class="'.$attribArray['class'].'"': '';
                            $blockSplit[$k] = '<h'.$typ.$align.$class.'>'.
                                $tC.
                                '</h'.$typ.'>';
                        }
                    break;
                }
                $blockSplit[$k+1] = ereg_replace('^[ ]*'.chr(10),'',$blockSplit[$k+1]); // Removing linebreak if typohead
            } else { // NON-block:
                $nextFTN = $this->getFirstTagName($blockSplit[$k+1]);
                $singleLineBreak = $blockSplit[$k]==chr(10);
                if (t3lib_div::inList('TABLE,BLOCKQUOTE,TYPOLIST,TYPOHEAD,'.$this->headListTags,$nextFTN)) { // Removing linebreak if typolist/typohead
                    $blockSplit[$k] = ereg_replace(chr(10).'[ ]*$','',$blockSplit[$k]);
                }
                // If $blockSplit[$k] is blank then unset the line. UNLESS the line happend to be a single line break.
                if (!strcmp($blockSplit[$k],'') && !$singleLineBreak) {
                    unset($blockSplit[$k]);
                } else {
                    $blockSplit[$k] = $this->setDivTags($blockSplit[$k],($this->procOptions['useDIVasParagraphTagForRTE']?'div':'p'));
                }
            }
        }
        return implode(chr(10),$blockSplit);
    }
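The typolist branch turns one line of stored content into one list item. A standalone sketch of that step is given below; typolistToHtmlSketch() is a hypothetical name, the numbered flag stands in for the type="1" attribute check, and preg_replace() replaces the removed ereg_replace().

```php
<?php
// Sketch of the <typolist> body to <ul>/<ol> conversion.
function typolistToHtmlSketch(string $content, bool $numbered = false): string
{
    // Trim one leading and one trailing line break around the list body.
    $content = preg_replace('/^[ ]*\n/', '', $content);
    $content = preg_replace('/\n[ ]*$/', '', $content);
    $lines = explode("\n", $content);
    $tag = $numbered ? 'ol' : 'ul';
    // Each remaining line becomes one list item.
    return '<' . $tag . '>' . "\n"
        . '<li>' . implode('</li>' . "\n" . '<li>', $lines) . '</li>'
        . '</' . $tag . '>';
}

echo typolistToHtmlSketch("\nApples\nPears\n", true), "\n";
```

For this input the result is an <ol> containing the two items "Apples" and "Pears", each on its own line.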

urlInfoForLinkTags ($url)

Parse an <a>-tag href and return its status: email, external, file, page or anchor.

Parameters:

string URL to analyse.

Returns:
array Information in an array about the URL

Definition at line 1425 of file class.t3lib_parsehtml_proc.php.
References $a, siteUrl(), and TYPO3_mainDir.
Referenced by t3lib_parsehtml_proc::TS_links_db().
{
    $info = array();
    $url = trim($url);
    if (substr(strtolower($url),0,7)=='mailto:') {
        $info['url']=trim(substr($url,7));
        $info['type']='email';
    } else {
        $curURL = $this->siteUrl(); // 100502, removed this: 'http://'.t3lib_div::getThisUrl(); Reason: The url returned had typo3/ in the end - should be only the site's url as far as I see...
        for($a=0;$a<strlen($url);$a++) {
            if ($url[$a]!=$curURL[$a]) {
                break;
            }
        }

        $info['relScriptPath']=substr($curURL,$a);
        $info['relUrl']=substr($url,$a);
        $info['url']=$url;
        $info['type']='ext';

        $siteUrl_parts = parse_url($url);
        $curUrl_parts = parse_url($curURL);

        if ($siteUrl_parts['host']==$curUrl_parts['host'] // Hosts should match
            && (!$info['relScriptPath'] || (defined('TYPO3_mainDir') && substr($info['relScriptPath'],0,strlen(TYPO3_mainDir))==TYPO3_mainDir))) { // If the script path seems to match or is empty (FE-EDIT)

            // New processing order 100502
            $uP=parse_url($info['relUrl']);

            if (!strcmp('#'.$siteUrl_parts['fragment'],$info['relUrl'])) {
                $info['url']=$info['relUrl'];
                $info['type']='anchor';
            } elseif (!trim($uP['path']) || !strcmp($uP['path'],'index.php')) {
                $pp = explode('id=',$uP['query']);
                $id = trim($pp[1]);
                if ($id) {
                    $info['pageid']=$id;
                    $info['cElement']=$uP['fragment'];
                    $info['url']=$id.($info['cElement']?'#'.$info['cElement']:'');
                    $info['type']='page';
                }
            } else {
                $info['url']=$info['relUrl'];
                $info['type']='file';
            }
        } else {
            unset($info['relScriptPath']);
            unset($info['relUrl']);
        }
    }
    return $info;
}
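
To make the classification order in the listing above easier to follow, here is a simplified standalone sketch for PHP 7 or later. It is an illustration under assumptions, not the class method: the function name classifyHref is hypothetical, the site URL is passed in as $siteUrl instead of being read from $this->siteUrl(), the TYPO3_mainDir check is left out (only an empty script path is accepted), and the relScriptPath and relUrl keys are not returned.

```php
<?php
// Hypothetical standalone sketch; it is not part of t3lib_parsehtml_proc.
function classifyHref(string $url, string $siteUrl): array
{
    $url = trim($url);
    if (strtolower(substr($url, 0, 7)) === 'mailto:') {
        return ['type' => 'email', 'url' => trim(substr($url, 7))];
    }

    // Length of the common prefix, bounded by the shorter of the two strings.
    $max = min(strlen($url), strlen($siteUrl));
    $a = 0;
    while ($a < $max && $url[$a] === $siteUrl[$a]) {
        $a++;
    }
    $relScriptPath = (string)substr($siteUrl, $a);
    $relUrl = (string)substr($url, $a);

    // Anything on another host, or not below the site URL, counts as external.
    $sameHost = parse_url($url, PHP_URL_HOST) === parse_url($siteUrl, PHP_URL_HOST);
    if (!$sameHost || $relScriptPath !== '') {
        return ['type' => 'ext', 'url' => $url];
    }

    $fragment = (string)parse_url($url, PHP_URL_FRAGMENT);
    $rel = parse_url($relUrl) ?: [];
    $path = trim($rel['path'] ?? '');

    if ($relUrl === '#' . $fragment) {
        // Only a fragment remains after the site URL.
        return ['type' => 'anchor', 'url' => $relUrl];
    }
    if ($path === '' || $path === 'index.php') {
        // Take everything after "id=" in the query string as the page id.
        $pieces = explode('id=', $rel['query'] ?? '');
        $id = trim($pieces[1] ?? '');
        if ($id) {
            return [
                'type' => 'page',
                'url' => $id . ($fragment !== '' ? '#' . $fragment : ''),
                'pageid' => $id,
                'cElement' => $fragment,
            ];
        }
        return ['type' => 'ext', 'url' => $url];
    }
    return ['type' => 'file', 'url' => $relUrl];
}

print_r(classifyHref('http://example.com/index.php?id=12#34', 'http://example.com/'));
```

The example call is classified as type "page" with the url value "12#34". Unlike the listing, the prefix loop here stops at the end of the shorter string, so it never reads past the end of $siteUrl.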

Generated on Sun Oct 3 01:06:02 2004 for TYPO3core 3.7.0 dev by Doxygen 1.3.8-20040913

