Help stripping out HTML tags

I think the weather alert page has changed somewhat, and I can't seem to strip out the HTML tags at the beginning and end of the alert. Does anyone have any ideas on how to fix this?

This is what I have in my script:

resp=StripHTML(resp)    'strip the html tags
l=len(resp)
s=instr(resp, "National Weather Service")
resp=right(resp, l-s-25000)    'chop off the header
l=len(resp)
s=instr(resp, "Close this window")
resp=left(resp, l-s-6000)    'chop off the trailer
'    resp = Replace(resp, vbNewLine, "")        'get rid of newlines, etc

Reply #1

It looks like you have a function (StripHTML) to remove the tags. What's that look like? Also, are you modifying an existing widget? If so, which one?

For what it's worth, here's the function I use to remove all HTML tags.

Function removeXstuff(stuff)
 'Remove everything between angle brackets from the extracted info
  xstuffL = Instr(1, stuff, "<")
 
 While xstuffL <> 0
   xstuffR = Instr(xstuffL, stuff, ">")+ 1
   xstuffLen = xstuffR - xstuffL
   xstuff = Mid(stuff, xstuffL, xstuffLen)
   stuff = Replace(stuff, xstuff, "")
   xStart= xstuffR - 1
   xstuffL = Instr(xStart, stuff, "<")
 Wend
 
 removeXstuff = stuff
End Function

It works for the most part. Once or twice I've had to run the info through the function a second time to get some straggler tags out.
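
A likely reason for the stragglers, as far as I can tell: after a tag is removed the string gets shorter, but the next search starts from xstuffR - 1, which was computed on the old, longer string. Any tag that begins right after the removed one gets skipped. For example, with "a<b>c<i>d" the <b> is removed, the search resumes at position 4, and the "<" of <i> is now at position 3, so it is never found. Below is a minimal sketch that resumes at the same position instead. RemoveTags is just an illustrative name, not part of any widget.

```vbscript
'Illustrative sketch: remove every <...> tag in a single pass
Function RemoveTags(stuff)
    Dim tagStart, tagEnd
    tagStart = InStr(1, stuff, "<")
    Do While tagStart <> 0
        tagEnd = InStr(tagStart, stuff, ">")
        If tagEnd = 0 Then Exit Do   'unterminated tag: leave the rest alone
        'Cut out this one tag only
        stuff = Left(stuff, tagStart - 1) & Mid(stuff, tagEnd + 1)
        'The text has shifted left, so resume the search at the same position
        tagStart = InStr(tagStart, stuff, "<")
    Loop
    RemoveTags = stuff
End Function
```

It also bails out if a "<" has no closing ">", which in the original loop would produce a negative length for Mid.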

Reply #2

Yes, I am modifying an existing widget, the LookingGlass Weather suite. I have already figured out the new address for the weather website, which had changed. Now it keeps showing all the extra HTML tags. With l-s-6000 I was finally able to get the trailing tags off, but I cannot seem to get the header tags off. This is everything that I see there:

'********** Get Severe Weather Alert Data
Function GetWeatherAlert()

    AlertWidth = 50
    Randomize
    Set http = CreateObject("MSXML2.ServerXMLHTTP")

    alert = 0
    final=""
    finalraw=""
    'html="<basefont size=2>"
    fontprep="<span style='font-size:10.0pt;font-family:Arial'>"
    colorprep="<body bgcolor=" & Chr(34) & "4d4d4d" & Chr(34) & " text="  & Chr(34) & "ffffff"  & Chr(34) & ">"
    scrollbarprep = "<style Type=" & Chr(34) & "text/css" & Chr(34) & ">body,html {scrollbar-arrow-color: white;scrollbar-Base-color: #4d4d4d;scrollbar-3dlight-color:black;scrollbar-highlight-color: #4d4d4d;scrollbar-face-color: #4d4d4d;scrollbar-shadow-color: black;scrollbar-darkshadow-color: black;}</style>"
    html = fontprep & colorprep & scrollbarprep
    Do until NewSWA(alert,0)=""
            http.Open "GET", "http://www.weather.com/weather/alerts/?alertId=" & NewSWA(alert,0) & "&dbSeq=null&cameFrom=national"
            Call http.Send()
            On Error Resume Next
            'Wait for up to 3 seconds if we've not gotten the data yet
            If http.readyState <> 4 Then
                http.waitForResponse 3
            End If
            'Did an error occur?
            If Err.Number = 0 Then
                If (http.readyState <> 4) Or (http.Status <> 200) Then
                    'Abort the XMLHttp request
                    http.Abort
                    resp = Object.PersistStorage("notfound")
                Else
                    resp = http.ResponseText
                End If
            End If
            resp=StripHTML(resp)    'strip the html tags
            l=len(resp)
            s=instr(resp, "National Weather Service")
            resp=right(resp, s-5000)    'chop off the header
            l=len(resp)
            s=instr(resp, "Close this window")
            resp=left(resp, l-s-6000)    'chop off the trailer
        '    resp = Replace(resp, vbNewLine, "")        'get rid of newlines, etc
            resp = Replace(resp, Chr(7) & " ", Chr(7))
            ThisAlert=Reformat(resp,AlertWidth)
            head = Reformat(NewSWA(alert,1),AlertWidth) & vbNewLine & "----------" & vbNewLine
            htmlhead = "<P align=center><B>" & Reformat(NewSWA(alert,1),AlertWidth) & "</B><BR>----------<BR></P>"
            If alert>0 Then head = vbNewLine & String(AlertWidth+2,"=") & vbNewLine & head
            If alert>0 Then htmlhead = "<BR><P align=center><B>" & String(AlertWidth+2,"=") & "</B></P>" & htmlhead
            final = final & head & ThisAlert
            html = html & htmlhead & resp
            alert = alert + 1
    Loop

    html = replace(html, Chr(7), "<P>") & "</span>"
    DesktopX.ScriptObject("AlertHTML").Control.Navigate "about:blank"
    object.sleep 75
    DesktopX.ScriptObject("AlertHTML").Control.Document.Write html
    DesktopX.ScriptObject("AlertHTML").Control.Refresh

    GetWeatherAlert=final
    Set http = Nothing
    Set resp = Nothing
   
End Function

 

Thank you for responding.

I also found this a little further down in my script:

'********** Strip HTML tags from a string, along with tabs and newlines
Function stripHTML(HTMLstring)
    HTMLstring = Replace(HTMLstring, vbTab, "")
    HTMLString = Replace(HTMLstring, vbCR, "")
    HTMLstring = Replace(HTMLstring, vbLF, "")
    HTMLstring = Replace(HTMLstring, "<P>", Chr(7))
    Set RegularExpressionObject = New RegExp
   
    With RegularExpressionObject
    .Pattern = "<[^>]+>"
    .IgnoreCase = True
    .Global = True
    End With
   
    stripHTML= RegularExpressionObject.Replace(HTMLstring, "")
    Set RegularExpressionObject = nothing

End Function
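
One thing worth noting about this function: the pattern "<[^>]+>" removes the tags themselves but not the text between <script> or <style> tags, and the Replace for "<P>" is case-sensitive, so "<p>" or "<P align=...>" would not be turned into Chr(7). If the leftover junk is mostly script or style content (an assumption, I have not seen the page source), a variant along these lines might help. StripHTMLAndScripts is a hypothetical name, and it leaves the tab and newline handling to the original function.

```vbscript
'Hypothetical variant of stripHTML that also drops <script> and <style> blocks
Function StripHTMLAndScripts(HTMLstring)
    Dim re
    Set re = New RegExp
    re.IgnoreCase = True
    re.Global = True

    'Remove script and style elements together with their contents
    re.Pattern = "<(script|style)[^>]*>[\s\S]*?</\1\s*>"
    HTMLstring = re.Replace(HTMLstring, "")

    'Mark paragraph breaks (any case, with or without attributes) with Chr(7)
    re.Pattern = "<p(\s[^>]*)?>"
    HTMLstring = re.Replace(HTMLstring, Chr(7))

    'Remove every remaining tag
    re.Pattern = "<[^>]+>"
    StripHTMLAndScripts = re.Replace(HTMLstring, "")
    Set re = Nothing
End Function
```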

 

Thanks.

Reply #3

I don't know why, but for some reason it finally started working.

I changed it to l-s-10500 for the header and l-s-5950 for the trailer, and it started working.

Thanks for your input.
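
A note on why the numbers are so touchy: InStr returns the position s of the marker, so Right(resp, l-s-10500) keeps the text starting 10500 characters past the marker, and Left(resp, l-s-5950) keeps a count that depends on the total page length rather than on where "Close this window" actually sits. Those constants probably only fit the page as it looks today. As far as I know, Left and Right also raise an error when the count goes negative, and with On Error Resume Next in effect the line is then skipped silently and resp stays untrimmed. Below is a sketch of trimming by marker position instead, assuming both marker strings still appear in the stripped text. TrimBetween is a hypothetical helper, not part of the widget.

```vbscript
'Hypothetical helper: keep only the text between two marker strings
Function TrimBetween(text, startMarker, endMarker)
    Dim s, e
    TrimBetween = text
    s = InStr(1, text, startMarker, vbTextCompare)
    If s = 0 Then Exit Function          'start marker missing: return text unchanged
    e = InStr(s, text, endMarker, vbTextCompare)
    If e = 0 Then e = Len(text) + 1      'end marker missing: keep through the end
    TrimBetween = Mid(text, s, e - s)    'from the start marker up to the end marker
End Function

'Possible use in place of the Right/Left lines
resp = TrimBetween(resp, "National Weather Service", "Close this window")
```

The result still begins with the start marker itself. If more of the header has to go, a later marker string would be a safer choice than a fixed character count.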